I joined Aleyda Solis, Paul Shapiro and Paige Hobart to discuss some of the worst technical SEO nightmares that websites face, the best ways to fix them fast, and how to avoid them.
Part of the SEMrush 5 Hours of Technical SEO webinar. You can watch the full recording on the SEMrush website, and Aleyda's presentation is available to download.
SEMrush is an all-in-one tool suite for improving online visibility and discovering marketing insights. The tools and reports can help marketers who work in the following services: SEO, PPC, SMM, Keyword Research, Competitive Research, PR, Content Marketing, Marketing Insights, and Campaign Management.
You can try SEMrush Pro for 14 days.
What are you waiting for? See what SEO opportunities are out there for you.
Table of contents
- Introduction
- Website Migration Issues: Problems and Fixes
- International Websites and Redirect Errors
- Hreflang Tags: Technical Issues
- SEO Technical Issues from Live Testing Environments
- Technical Errors from UX Testing
- The Importance of Internal Coordination Between Teams
- Newer Technical SEO Nightmares: JavaScript and More
- Future Technical SEO Nightmares: Mobile-First Indexing Problems
- Preventing PR Disasters from Technical SEO Errors
- Web Migrations: Strategy and Useful Metrics
- Preventing Technical SEO Issues as a Beginner
Introduction
Paige Hobart: If you're joining us now, you're going to have the wonderful Aleyda, Paul, and David take you on a journey through the worst technical SEO nightmares and how to end them. I think this is going to be an enjoyable session. Guys, if you want to introduce yourselves first, then Aleyda can introduce herself and jump straight into the deck.
Paul Shapiro: I'll start since I'm on the left. Hi, everyone, Paul Shapiro. I lead technical SEO for Condé Nast and all the Condé Nast brands, so that's publications such as GQ, Vogue, Wired, The New Yorker. Some people may have heard of them.
I founded the first technical SEO conference in the US, TechSEO Boost. I’m still involved with that. When the pandemic is over, we’re certainly going to have a 2021 event.
David Sayce: Cheers. I’m David Sayce. I’m a digital marketing and SEO consultant with Paper Gecko. I deal with all sorts of issues. I’ve been doing this since about ’96, ’97, so a few years of nightmares. Really looking forward to seeing this.
Aleyda Solis: My name is Aleyda Solis. I'm so thrilled to be able to share a few of my nightmares with you, nightmares that I'm sure some of you have also gone through, unfortunately. Today, what we want to do is share some tips and insights on how you can at least control those nightmares so they don't consume you. That is what is important. They will tend to happen from time to time, but we need to take control of them.
I am an international SEO consultant. I have my own SEO consultancy called Orainti. I am also very active on Twitter; I am @aleyda, in case you don't follow me. I like to share what I learn and what I do in SEO. I have a newsletter called SEO FOMO, and a video series called Crawling Mondays, where I interview other SEOs across different areas and on different topics.
So, the worst technical SEO nightmares and how to end them, right? Literally, this is also known as "How to avoid technical SEO f**k-ups" that can send your work to the garbage and could even end up making you lose your job, which is the actual nightmare, and what we definitely want to avoid.
The journey of technical SEO today can easily turn into an unexpected nightmare, unfortunately. Last year, I asked around on Twitter, and I got a little more than 500 answers about the top causes of SEO failure, of SEO process failure.
Not very shockingly, technical SEO resources, availability, and reliability issues were the top issues, and the fourth one was technical flexibility issues. We know that technical flexibility, resources, and capacity are among the very persistent types of problems that we tend to come across in an SEO process, on the one hand.
Then, on the other hand, we know that things have become more complex in recent years, with the popularization of JavaScript frameworks, and also with many new companies that come from an app-first way of working. They tend to literally start building their website after their app, and they build it with JavaScript frameworks again. Even last year, at one of Google's events for the search community, they acknowledged and shared how the current state of search needs more technically oriented SEOs because of all of these layers of complexity.
Website Migration Issues: Problems and Fixes
Today, I will go through the most common technical SEO nightmares, those that not only I have seen in my own SEO scenarios, in my own projects, but that people have confirmed they tend to see again and again as the worst ones.
Let's start with the classic ones, those that already claim a small share of our minds every night, like web migrations. Web migrations can go very wrong. Even huge brands tend to implement migrations like this, and then they go wrong.
Funnily enough, a few months ago, when I was talking about SEO for e-commerce websites, I was featuring the Nordstrom navigation as something you need to avoid, because they were internally linking to the top categories and some facets from the main navigation with tagged URLs. Tagging the URLs, adding parameters to the URLs, I guess to track or monitor click behaviour, can be done in many other ways, with events, for example, without relying on parameters in the URLs. You really want to link to the actual indexable URL version, of course, especially in your main navigation.
And not only major brands; this can happen to everybody. Sometimes we come across scenarios in which everything that can go wrong does go wrong, like for Matthieu here. After an international migration, despite many checks: the new language directory was disallowed; the Googlebot user agent was blocked; links were nofollowed; there were URL parameters on each redirected URL, with canonicals to the final URLs; and, a few days later, robots.txt was updated to block those same parameters. Many things can go wrong. It's crazy, right?
This is another scenario that I thought was interesting, because it was not the typical migration of the URL structure, web structure, or even design, but a server migration. Gianna here shared that they were relying on a plug-in in their platform, and all of a sudden, just because of the server migration, the cache didn't refresh, which made it redirect erroneously and end up serving a blank page every single time, to Googlebot and users alike.
I've created a checklist in Google Sheets that you can copy, of course. This is my checklist for the minimum viable SEO validation whenever a web migration happens. I have also added some extra columns to the right, with the status, dates, and comments.
Because, ideally, this is the thing: we should do this well before the migration, so we can share it with all of the team members involved, with the stakeholders, and go through all of it together.
I have also shared, in one of my Crawling Mondays videos, how to recover when a web migration goes wrong, because this tends to happen too. We very specifically prioritize the tasks to go through all of those URLs that are not redirecting, or not correctly redirecting, to the indexable new version of each former page.
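To make that validation concrete, here is a minimal sketch in TypeScript (assuming Node 18+ for the built-in fetch, and a hypothetical oldUrls list exported from your pre-migration crawl) that checks whether each legacy URL 301-redirects in a single hop to a destination that returns a 200:

```typescript
// Migration validation sketch: follow each legacy URL's redirect chain hop
// by hop and flag anything that is not a single 301 landing on a 200 page.
const oldUrls = [
  "https://example.com/old-category/",
  "https://example.com/old-product/",
]; // hypothetical list from the pre-migration crawl

async function checkRedirect(url: string): Promise<void> {
  const hops: number[] = [];
  let current = url;
  for (let i = 0; i < 5; i++) {
    const res = await fetch(current, { redirect: "manual" });
    if (res.status >= 300 && res.status < 400) {
      hops.push(res.status);
      const location = res.headers.get("location");
      if (!location) break; // a redirect without a Location header is broken
      current = new URL(location, current).toString();
    } else {
      const ok = hops.length === 1 && hops[0] === 301 && res.status === 200;
      console.log(
        `${ok ? "OK " : "FIX"} ${url} [${[...hops, res.status].join(" -> ")}] -> ${current}`
      );
      return;
    }
  }
  console.log(`FIX ${url}: redirect chain too long or broken`);
}

(async () => {
  for (const url of oldUrls) await checkRedirect(url);
})();
```

A dedicated crawler will do this at scale, but a script like this is enough to spot-check the most important legacy URLs right after the switch.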
It's not only with URL or server migrations; even with redesigns and structural web changes we can see these types of issues. Dawn here shared a sadly common situation: "The dev team redirected every single vehicle page on an auto site to a lifestyle page with absolutely no words when reskinning a site." Needless to say, that didn't go very well.
This is the thing: I have found that this sometimes happens when pruning the site; the web development team or some of the content team members decide, "Oh, very old content, nobody's reading it, no traffic. We can eliminate it."
But they haven't noticed that those pages have thousands of backlinks from back in the day, or that they are bringing non-trivial organic search rankings, or that they are internally cross-linking to some of the most important pages of the website.
Then, Andrew from Optimisey also shared another very crazy horror story about removing content from a site. Even if we don't think of this as a technical type of issue, it is, because you need to validate what needs to return a 410, what needs a 301 redirect and to where, and whether the new destination pages are indexable and crawlable, whether they keep the relevance of the former page, and whether they will fulfil the intent or not.
So, yeah, it's not purely content; there is also a technical layer to this type of validation that is important to take into consideration. And for that, I have created and shared a flow chart in the past to go through the different scenarios and facilitate the decision-making of what to do with the pages.
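As a rough illustration of that kind of decision logic (a sketch in the spirit of the flow chart, not a reproduction of it, with hypothetical signal names), the core branching might look like this:

```typescript
// Hypothetical signals for a page that someone wants to remove.
interface PageSignals {
  hasBacklinks: boolean;       // external links still point at the page
  hasOrganicTraffic: boolean;  // non-trivial organic search visits
  relevantTargetUrl?: string;  // a newer page that fulfils the same intent
}

// Rough decision logic: redirect when there is equity or traffic AND a
// relevant destination; keep pages that still earn something; otherwise 410.
function decideRemoval(page: PageSignals): string {
  const hasEquity = page.hasBacklinks || page.hasOrganicTraffic;
  if (hasEquity && page.relevantTargetUrl) {
    return `301 -> ${page.relevantTargetUrl}`;
  }
  if (hasEquity) {
    return "keep (refresh or consolidate instead of removing)";
  }
  return "410 Gone";
}

console.log(
  decideRemoval({ hasBacklinks: true, hasOrganicTraffic: false, relevantTargetUrl: "/new-page/" })
); // "301 -> /new-page/"
```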
This is the situation here that Claire shared: when you are a franchisee, hosted as a subfolder of the main domain, and head office decides to eliminate you. Just like that, right? No redirect, no anything.
So, what can you do? What is important here is to understand that these things can happen. This is the thing: planning and alignment with stakeholders and decision-makers before making these changes is critical, fundamental for success.
International Websites and Redirect Errors
Another classic source of SEO horror stories, one in which I have unfortunately also been very involved, happens with international websites that automatically redirect users and bots based on their IP locations. I mean, I understand that there might be some valid use cases here; for example, if you have a forex website and you literally cannot show some pages to US users, only to European-based ones, things like that.
But you lose your rankings, because if you are, for example, redirecting anyone on a US IP to the US version only, then, still today, most of the IPs that Google uses to crawl the web are from the US. They may never see your version for France or Germany, because they don't crawl the web with French or German IPs, unfortunately. If you need to justify this decision, you can point to the official recommendations from Google here, and even to John Mueller here, about not doing this.
On the other hand, what you can do is suggest the best version to your users, as we can see Adidas doing here, in a non-intrusive way, ideally. Then, whenever the user actually selects their preferred version, you can personalize the redirect for that particular user via cookies. But this is a personalized type of redirect.
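A minimal sketch of that pattern, assuming an Express server and a hypothetical preferred_locale cookie that is only set when the user explicitly picks a version (so bots and first-time visitors are never redirected):

```typescript
import express from "express";

const app = express();

// Only redirect visitors who have explicitly chosen a version before,
// stored in a hypothetical "preferred_locale" cookie. First-time visitors
// and crawlers see the URL they requested, plus a non-intrusive suggestion.
app.use((req, res, next) => {
  const cookies = req.headers.cookie ?? "";
  const match = cookies.match(/(?:^|;\s*)preferred_locale=([a-z-]+)/i);
  const userAgent = req.headers["user-agent"] ?? "";
  const isBot = /googlebot|bingbot/i.test(userAgent);

  if (match && !isBot && !req.path.startsWith(`/${match[1]}/`)) {
    // 302, not 301: this is a per-user, preference-based redirect.
    return res.redirect(302, `/${match[1]}${req.path}`);
  }
  next();
});

app.get("/:locale/*", (_req, res) => res.send("localized page"));
app.listen(3000);
```

The key point is that the redirect only ever fires on an explicit, stored preference, never on the IP address of the request.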
Hreflang Tags: Technical Issues
Another very common SEO horror story for international websites comes with the usage of hreflang annotations, which, I don't know why or how, can go wrong in crazy ways. Fabrizio was sharing the other day that the hreflang annotation on the Stripe website is es-419, which is supposed to be Spanish for Latin America; but hreflang annotations literally don't support localizations like this. It is es, Spanish, and that's it. Hreflang annotations serve to specify the alternates of a given page: the versions of that page that exist in other languages or that target other countries.
For example, in this particular case that I'm showing here, when I search for these Nike shoes from the US, I should be able to see the US version. But something is happening here, and instead of the US version, I see the GB (UK) version, the Luxembourg version, and the Canadian version of this page.
Maybe if it is because they don't have a US page for these shoes, then great, there's nothing wrong, nothing to fix. But if there is an actual US page for these particular shoes, this is what should be fixed, and that is exactly what hreflang annotations are useful for in this scenario: they would effectively point Google to the US page for the shoes, so it is shown here instead of all of these other countries' versions.
I have also created a flowchart to help you decide whether it is worth using hreflang annotations in the first place. For example, you're not meant to use hreflang annotations if you have pages in only a single language or for a single country. There's no purpose in using the annotations then, because their purpose is to specify which are your alternate versions; if you don't have alternate versions, you shouldn't use hreflang annotations.
I also see a lot of people using them to point to pages that are not meant to be indexable anyway. If the pages are not meant to be indexed, there's no purpose in wasting your time on them.
Another scenario is trying to implement hreflang annotations for every single URL of a huge website with millions of highly dynamic URLs that are changing all the time, as e-commerce sites often are.
In this particular case, it's about prioritizing the more static pages that generate issues again and again. Potentially, these won't be the highly dynamic product pages that are changing all the time, that almost don't even have time to rank, and that are not necessarily the ones bringing the traffic.
To do this type of analysis, to find the worthwhile pages that are actually ranking in a non-relevant market and that you should prioritize for hreflang annotations, you can use SEMrush, for example, or Google Search Console data, or Google Analytics data, and see which pages are attracting traffic from countries they shouldn't be, because there are other pages meant to rank for those countries.
I also created this Data Studio report to easily identify this by applying certain filters. It's a step-by-step Google Data Studio report that you can copy from the URL that I am leaving below. For the values, you can use the hreflang tag generator here, which is completely free.
Realistically, if you use one of the most common CMSs out there, like WordPress, Magento, or Shopify, they also support automated hreflang implementation, directly or with plugins or extensions. This should make it very straightforward to configure, once you prioritize which pages need hreflang annotations in the first place.
And if you don't have this type of support, then you have hreflang Builder. That is a paid tool, but it's very well worth it for enterprise-level websites, as it allows you to generate complex XML sitemaps with hreflang annotations in a very straightforward way.
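For simpler cases, the sitemap format itself is easy to generate. Here is a minimal sketch (hypothetical URLs) that emits one <url> block per alternate, with the reciprocal xhtml:link annotations and an x-default, following Google's sitemap hreflang format:

```typescript
// One page and its alternates; hreflang values combine an ISO 639-1
// language code with an optional ISO 3166-1 Alpha 2 country code.
const alternates = [
  { hreflang: "en-us", url: "https://example.com/us/shoes/" },
  { hreflang: "en-gb", url: "https://example.com/uk/shoes/" },
  { hreflang: "es", url: "https://example.com/es/shoes/" },
  { hreflang: "x-default", url: "https://example.com/shoes/" },
];

// Every version must list every alternate (including itself), so the
// annotations stay reciprocal across the whole set.
const links = alternates
  .map((a) => `    <xhtml:link rel="alternate" hreflang="${a.hreflang}" href="${a.url}"/>`)
  .join("\n");

const urlBlocks = alternates
  .filter((a) => a.hreflang !== "x-default")
  .map((a) => `  <url>\n    <loc>${a.url}</loc>\n${links}\n  </url>`)
  .join("\n");

console.log(
  `<?xml version="1.0" encoding="UTF-8"?>\n` +
    `<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"\n` +
    `        xmlns:xhtml="http://www.w3.org/1999/xhtml">\n${urlBlocks}\n</urlset>`
);
```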
SEO Technical Issues from Live Testing Environments
Finally, the last classic scenario for SEO horror stories that will cause horrible nightmares: web releases and test coordination. This is the thing; this requires a lot of excellent coordination with the web development team. I literally just picked the top ones, but leaving test environments crawlable is, unfortunately, still widespread.
One of the top things that I do when I start a new SEO process is to ask: is there a subdomain that you use for testing purposes? Do you have any open version of the website that is crawlable? Then I go directly and check with SEMrush: are there any subdomains out there, beyond the www, that look like a dev version of the website?
There are so many ways to avoid this from happening. The best way, though: please don't rely only on robots.txt on that subdomain, because we know that Google can still index those pages if someone links to them, unfortunately, and test environments tend to replicate what will be seen in the production environment. The best approach is to rely on HTTP authentication.
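A minimal sketch of that HTTP authentication, assuming an Express-based staging server (the credentials are placeholders). Unlike robots.txt, this returns a 401 to every unauthenticated request, so nothing can be fetched, crawled, or indexed even if someone links to the subdomain:

```typescript
import express from "express";

const app = express();

// HTTP Basic authentication in front of the entire staging site.
const USER = "staging";    // placeholder credentials; use real secrets
const PASS = "change-me";

app.use((req, res, next) => {
  const header = req.headers.authorization ?? "";
  const expected = "Basic " + Buffer.from(`${USER}:${PASS}`).toString("base64");
  if (header !== expected) {
    res.set("WWW-Authenticate", 'Basic realm="staging"');
    return res.status(401).send("Authentication required");
  }
  next();
});

app.get("/", (_req, res) => res.send("staging home"));
app.listen(3000);
```

In practice you would usually configure this at the web server or CDN level rather than in the application, but the effect is the same.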
We must also monitor this very closely to avoid the meta robots, canonical tags, and robots.txt of the test environment overwriting those of the live site when we release. This is why it's important to test not only in the dev environment before a release but also on production after it has been launched.
Even if you literally tested this 10 minutes before in the dev environment, check again, revise again, crawl again on the production one, even if you sent the developer the exact code to be copy-pasted.
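As a sketch of what that post-release spot check can look like (hypothetical URLs and expectations; the regex extraction is naive, and a real check should parse the DOM), you can fetch a few key production pages and verify the status code, meta robots, and canonical against what you expect:

```typescript
// Post-release spot check: verify status, meta robots, and canonical on key
// production URLs. Assumes Node 18+ (built-in fetch); values are examples.
const expectations = [
  { url: "https://example.com/", canonical: "https://example.com/", robots: "index, follow" },
  { url: "https://example.com/category/", canonical: "https://example.com/category/", robots: "index, follow" },
];

// Naive regex extraction, good enough for a quick check of typical markup.
const grab = (html: string, re: RegExp) => html.match(re)?.[1]?.trim() ?? "(missing)";

(async () => {
  for (const exp of expectations) {
    const res = await fetch(exp.url);
    const html = await res.text();
    const robots = grab(html, /<meta[^>]+name=["']robots["'][^>]+content=["']([^"']+)["']/i);
    const canonical = grab(html, /<link[^>]+rel=["']canonical["'][^>]+href=["']([^"']+)["']/i);
    const ok = res.status === 200 && robots === exp.robots && canonical === exp.canonical;
    console.log(`${ok ? "OK " : "FIX"} ${exp.url} status=${res.status} robots="${robots}" canonical="${canonical}"`);
  }
})();
```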
Technical Errors from UX Testing
And then, of course, there is whatever UX testing is happening. This is a good one from Arnold. When you are running any A/B or multivariate tests, be careful with the redirects, the canonicalization, and the no-indexation of some of these pages. Rachel Costello wrote a really good article covering the main configurations that can go wrong, and how you can verify that this doesn't happen, when doing A/B and multivariate testing within your organization.
And then, of course, for analytics and testing scenarios like these, it's important to work not only with your web development team but also with your design and UX team. If they want to track user behaviour, there are always ways to do it that don't mess up the SEO.
For example, if they want to track internal links, don't use parameters; use events. There are other methodologies you can leverage that are integrated with analytics tools and won't cause the mess that using UTM parameters will.
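As a minimal sketch of that event-based approach (assuming the standard gtag.js snippet is already loaded on the page; the event name and selector are hypothetical), click tracking can live entirely in a listener while the href stays the clean, indexable URL:

```typescript
// Track clicks on main-navigation links with an analytics event instead of
// appending parameters to the URL, so internal links keep pointing at the
// clean, indexable version. Assumes gtag.js is already loaded on the page.
declare function gtag(...args: unknown[]): void;

document.querySelectorAll<HTMLAnchorElement>("nav a").forEach((link) => {
  link.addEventListener("click", () => {
    gtag("event", "nav_click", {
      link_url: link.href,                       // the clean destination URL
      link_text: link.textContent?.trim() ?? "", // which label was clicked
    });
  });
});
```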
The Importance of Internal Coordination Between Teams
Another proof of why we need to align with all the areas here, not only with design and web development: in one case, link building was attracting tons of very, very valuable backlinks, and they were all going to a page that was canonicalized to another one, unknown to the agency that was working for the website. That shouldn't have happened. It's very, very important to coordinate well, to align well with other teams.
Or also when cleaning up the site after being hacked: "Oh, we are victims of spam, or there's some hacking going on. What do we do? In the meantime, let's disallow everything so Google doesn't come and see that we have been hacked."
There are ways to handle this; don't do that. Try to remove and clean up as soon as possible, but please don't block your website like this. You will stop ranking very quickly, and all the effort that you have made will go to the garbage, I'm afraid.
The purpose here is that you see SEO as an integrated part of your company's product triangle. I developed this visualization that you see on screen for an article a few months ago, when I was analyzing what needs to come next and how my requirements and recommendations align with all the features, functionality, and business or sales efforts of the website: those that have more to do with technical challenges and requirements, those that have to do with the business side of things, and those that are already in the pipeline from the marketing and growth areas, and how we align with each other to leverage every single opportunity. And to know that what I am prioritizing also makes sense to other areas, and how I can win and earn support from them. This is why good alignment is important.
Newer Technical SEO Nightmares: JavaScript and More
What about the new technical SEO nightmares? A very typical one, a very new, very hot one: JavaScript frameworks. Things can really, really go wrong here, and I got quite a few of these. We know that when you use JavaScript frameworks, things tend to take longer to load. And that is a problem; you need to be very, very fast.
If your website has millions of URLs, again, things can go wrong very, very quickly. It's important to avoid relying on JavaScript as much as possible when you can very well do it with straightforward HTML. And, of course, try to follow web standards as much as possible.
Another horror story here from Pedro, literally with links that were not implemented with <a> tags and, of course, had no href values. How will Googlebot be able to crawl that, right? No way. And here, Barry shared his own scenario: "A crawl without Javascript rendering showed 40k pages, but with (rendering) 400k." Yeah, some essential on-page internal links were client-side JavaScript.
Here, again, we can see how these crawlability issues, caused by relying too much on JavaScript, can make or break a site. This is completely unnecessary in 2020. There are very well-specified guidelines from Google at this point.
There are also so many tools out there, even free ones. For example, this one here from searchVIU literally allows you to quickly check what is rendered using JavaScript or not: the gap between the raw HTML and the rendered DOM. You can see the gap in everything, for every page, and you can do it from a mobile and a desktop crawl perspective too. It tells you the titles, descriptions, canonical tags, content, and links, and what changes on the website from one to the other.
Also, at this point, most SEO crawlers have the capacity to crawl with and without JavaScript and then compare and show the gap. I love this feature from Sitebulb, for example, because they literally specify which of the links they crawl were created or changed with JavaScript.
They check the difference between the DOM and the raw HTML. It's so convenient because of how they show it for every page, and also on an aggregated basis with all of the links that they crawl, allowing you to easily compare between crawls, as I'm showing here. It should be very, very straightforward.
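If you want to reproduce that kind of gap check yourself, here is a minimal sketch using Puppeteer (the URL is a placeholder) that counts <a href> links in the raw HTML versus the rendered DOM of a page:

```typescript
import puppeteer from "puppeteer";

// Count <a href> links in the raw HTML versus the rendered DOM to surface
// internal links that only exist after client-side JavaScript runs.
const countLinks = (html: string): number =>
  (html.match(/<a\s[^>]*href=/gi) ?? []).length;

(async () => {
  const url = "https://example.com/"; // placeholder
  const rawHtml = await (await fetch(url)).text(); // Node 18+ built-in fetch

  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto(url, { waitUntil: "networkidle0" });
  const renderedHtml = await page.content(); // serialized rendered DOM
  await browser.close();

  console.log(`raw HTML links:     ${countLinks(rawHtml)}`);
  console.log(`rendered DOM links: ${countLinks(renderedHtml)}`);
  console.log(`JS-only links:      ${countLinks(renderedHtml) - countLinks(rawHtml)}`);
})();
```

A large gap between the two numbers is exactly the kind of 40k-versus-400k discrepancy Barry described.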
Future Technical SEO Nightmares: Mobile-First Indexing Problems
Finally, to finish wrapping up here: the nightmares that we expect to come. We should already be prepared at this point for mobile-first indexing, for which we have had a little bit of extra time. It was delayed already, and now it's finally going to happen in May 2021.
Google, again, in this case, has done really great work documenting what we should be doing as best practice to avoid bad things from happening. Literally, it's pretty much about making sure, on the one hand, that our website is crawlable for the mobile user agent, the mobile Googlebot; and on the other hand, that we show the mobile Googlebot all the content that we were showing to the desktop one.
They have been expanding their guides to cover many additional use cases too. For example, with lazy loading on mobile versions: making sure how we display the images, that the images are actually crawlable, that we're still linking to these images, and, if we rely on JavaScript to render or show them, that they are still accessible and crawlable from the HTML, things like that.
To make sure, go to the Google Search Console, see which is the primary crawler that is being used for your website. Don’t assume. Assumptions are the mother of f**k-ups. Go and take a look at the primary crawler.
There are so many tools out there. We can see here how this particular tool on technicalseo.com, the mobile-first index tool, lets us validate pages for free and see the gap between mobile and desktop in a very straightforward way.
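As a rough sketch of the same idea (placeholder URL; the user-agent strings are the documented Googlebot desktop and smartphone ones from that period), you can fetch a page with both user agents and compare what comes back. A real audit would also diff titles, canonicals, links, and structured data:

```typescript
// Fetch a page as the desktop and smartphone Googlebot and compare the raw
// responses; big gaps suggest the mobile version is missing content.
const USER_AGENTS = {
  desktop:
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
  mobile:
    "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.96 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
};

(async () => {
  const url = "https://example.com/"; // placeholder
  for (const [name, ua] of Object.entries(USER_AGENTS)) {
    const html = await (await fetch(url, { headers: { "User-Agent": ua } })).text();
    const words = html.replace(/<[^>]+>/g, " ").split(/\s+/).filter(Boolean).length;
    const links = (html.match(/<a\s[^>]*href=/gi) ?? []).length;
    console.log(`${name.padEnd(7)} ${words} words, ${links} links`);
  }
})();
```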
These types of scenarios are not necessarily a knowledge issue but an execution challenge. How do we really avoid this? By talking with people and coordinating with people. This relies on people, as Areej very well said in her presentation at BrightonSEO last year. Avoid f**k-ups by establishing a healthy SEO framework: SEO understanding across teams, through training for the different areas involved in the SEO process, and a consistent SEO checklist to be used across teams whenever there's a change on the website.
I shared a checklist here that you can use to implement automated validations within web platforms to follow SEO best practices; the more of these that can be integrated directly within the platform, the better. Then, recurring and ongoing SEO tech and content manual validation, pre and post web release, with crawls.
Also, recurring and ongoing SEO tech and content automated validation, pre and post releases, with scheduled crawls; most of the tools allow us to do this too. And automated monitoring, with alerts, for technical configuration and content changes on the web; ContentKing and Little Warden are amazing for this too, as the sketch below illustrates.
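In the same spirit as those monitoring tools (a minimal sketch of the idea, not their implementation; URLs are placeholders), scheduled change detection can be as simple as snapshotting each key URL's title, canonical, and meta robots, and alerting on any diff in the next run:

```typescript
import { existsSync, readFileSync, writeFileSync } from "node:fs";

// Scheduled change detection: snapshot title/canonical/robots for key URLs
// and report anything that changed since the last run (run it from cron).
const URLS = ["https://example.com/", "https://example.com/category/"]; // placeholders
const BASELINE_FILE = "seo-baseline.json";

const grab = (html: string, re: RegExp) => html.match(re)?.[1]?.trim() ?? "";

(async () => {
  const baseline: Record<string, Record<string, string>> = existsSync(BASELINE_FILE)
    ? JSON.parse(readFileSync(BASELINE_FILE, "utf8"))
    : {};

  for (const url of URLS) {
    const html = await (await fetch(url)).text();
    const current: Record<string, string> = {
      title: grab(html, /<title[^>]*>([^<]*)<\/title>/i),
      canonical: grab(html, /<link[^>]+rel=["']canonical["'][^>]+href=["']([^"']+)["']/i),
      robots: grab(html, /<meta[^>]+name=["']robots["'][^>]+content=["']([^"']+)["']/i),
    };
    const previous = baseline[url];
    if (previous) {
      for (const key of Object.keys(current)) {
        if (previous[key] !== current[key]) {
          console.log(`ALERT ${url}: ${key} changed "${previous[key]}" -> "${current[key]}"`);
        }
      }
    }
    baseline[url] = current;
  }
  writeFileSync(BASELINE_FILE, JSON.stringify(baseline, null, 2));
})();
```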
Thank you very much for the opportunity. I hope that I have covered most of those SEO nightmares and issues. I hope it is useful too and looking forward to having a conversation about them now.
Preventing PR Disasters from Technical SEO Errors
Paige Hobart: Wonderful. Wonderful. Thank you. We've got some time for some questions, so I'm going to get right to it. I thought there was quite a fun one that I think Paul's got an answer to already. Simon Cox says, "On the news today a French news website moved CMS and published a set of draft obituaries of famous people. How do you recover from something like this?"
Paul Shapiro: Yeah, I think I was being a little bit cheeky. You can get that removed from the index, right? You send it to a 404, you try to get Google to crawl it, you make sure it gets dropped, and then you make sure it's not crawled again: you properly block it, noindex it, 301 it, whatever. You add a robots directive after you get it removed from the index, not before, because you don't want it to be blocked before it can be recrawled.
Aleyda Solis: No, it's funny because, literally, if you make use of the temporary content removal feature in Google Search Console, which is handy when content that shouldn't have been made available in the first place goes live, that actually works very quickly, I have to say.
David Sayce: The problem with those is that the bigger the mess-up, the quicker somebody is to take a screenshot.
Web Migrations: Strategy and Useful Metrics
Paige Hobart: Oh, yeah, absolutely. On a more serious note, I’ve got a nice question from Monica Wong, “I’m going through a web migration, what metrics do I measure?” A lot of these issues were migration-related, weren’t they?
Aleyda Solis: Yes, indeed. Well, I think that it depends on the goal of the migration. This is the thing. The first question that I will ask when a migration is supposed to happen is: what do you want to achieve with the migration? Is the migration actually needed to achieve that goal in the first place? Because, yeah, let's try to make migrations happen only when they are really needed.
It's important, first, to track and monitor the particular KPIs tied to the goal we're trying to achieve with that migration. And then, of course, rankings, traffic, conversions, and the URLs that were crawled and indexed before and after; I would say that the KPIs should sit at the different layers of the work, starting from a crawlability perspective.
And, of course, for our most important queries and pages, we should check not only that we recover the traffic, but also that the pages that are actually meant to rank, and that were ranking before, are the ones ranking now.
Paul Shapiro: Yeah. And to add to that, I think it's worth asking: what kind of migration? There are actually many types of migration that can occur. You could simply be moving onto a new CMS while retaining all the same URLs; you could be redesigning the pages; you could be moving onto new URLs and a new CMS.
David Sayce: Or they could be all of them.
Paul Shapiro: There are many different forms, and sometimes the KPIs differ based on that. If it’s simply moving to a new UI, then you need to look at the engagement.
David Sayce: Yeah. I mean, it's funny you mention the various types of migration. The other issue is when a business tries too hard to migrate everything at once; so often I see that being a problem. It's the, "Well, we're going to re-platform, so let's update the design. Hey, while we're doing that, let's change these category pages over here and update this."
It just becomes too much. The key really is, with the metrics, get an excellent snapshot of where you are at the moment because, believe me, you’re really going to need that if things get messed up later on.
Aleyda Solis: Even with Screaming Frog; every single tool nowadays allows you to take a snapshot of the rendered page, so we can see how it was, what the main configuration of the main crawlable pages was before, and compare the before and after. That is something very, very important.
Again, try to avoid doing too many things at the same time as much as possible, because then it's much more difficult to isolate issues and see what has had an impact.
David Sayce: It's about being able to understand where the mistakes are being made. And that's why having that gap between any major changes makes such a difference. But, yeah, funnily enough, I was on the Wayback Machine today trying to solve various issues. It's always more interesting when you've been involved in a project from the beginning, but we don't often work like that.
Preventing Technical SEO Issues as a Beginner
Paige Hobart: “What would you recommend is the most important thing someone starting out in SEO should do/avoid to avoid these big SEO nightmares?”
Paul Shapiro: We talked about making use of checklists, and those are certainly invaluable whether they’re a public-facing checklist or your own private checklist. But something I would be wary of is making sure that each item on that checklist is detailed enough and not in conflict with one another. Really think about how those different items are interacting with one another and whether they reflect the full picture.
Aleyda Solis: With a checklist, it’s critical to specify the SEO scenarios and give examples. That makes a lot of difference. Be as specific as you can be.
David Sayce: I think education really does go a long way. With a lot of the projects I work on, it's there from the very beginning. The other thing is, when you have one of these wonderful technical SEO issues, these nightmares, those checklists are great to fall back on, but the main thing is not to panic.
I've seen this quite a few times: the technical SEO nightmare starts and then has this snowball effect. You have senior stakeholders screaming at SEOs and digital marketers, "What on earth is going on?" and panic sets in. I think having some of those checklists in place, and having the presence of mind not to go into a blind panic, is quite vital on the technical SEO side.
Paige Hobart: Thank you, everybody, that’s posted your questions, I appreciate it. There’s a lot that we’ve not got around today, but thank you all.
Technical SEO refers to the optimisation of a website for search engine crawling: making sure the website is accessible and can be crawled without issues.
A technical SEO audit is a process to uncover opportunities to improve the visibility of a website in organic search.
Technical issues on a website can send a signal to search engines that the website is not a quality website; these issues can also make it difficult for visitors to use your website.
Primarily, these are issues around the crawling and indexing of the website. Important areas to review include the use of HTTPS, redirects, hreflang, sitemaps, schema markup, and speed.
00:00:00 Martha van Berkel, Paige Hobart, Jeffrey Burns, Arnout Hellemans — Schema Markup Explained: 10 Complicated Concepts Made Simple and Actionable
01:01:30 Aleyda Solis, Paige Hobart, David Sayce, Paul Shapiro — The Worst Technical SEO Nightmares and How to End Them
01:58:45 Jes Scholz, Paige Hobart, Liraz Postan, Andrew Coco — Enterprise Site Pagination Faux Pas
02:29:40 Bartosz Góralewicz, Nik Ranger, Cindy Krum, Will Critchlow — The Real Problems Behind Indexing
03:30:13 Ric Rodriguez, Nik Ranger, Dawn Anderson, Duane Forrester — How Knowledge Will Define The Future Of Search — And What To Do About It Now
03:59:03 Kristina Azarenko, Nik Ranger, Ulrika Viberg, Vahan Petrosyan — JavaScript, SEO and Dollhouses
04:30:20 Jamie Alberico, Nik Ranger, Jeff Coyle, Kevin Indig — How to Leverage Insights from Your Site’s Server Logs