Crawling large ecommerce sites
Maria Camanes, Senior SEO Consultant
● Over 6 years in SEO. Now a Senior SEO Consultant at Builtvisible,
which I joined 3 years ago
● Passionate about the technical side of SEO and specialised in site
speed optimisation and eCommerce SEO
● Work across a variety of accounts but mostly eCommerce sites
● Occasional speaker and regular trainer at BrightonSEO
● Twitter @mariacamanes
Common issues:
• A missing or wrongly implemented product retirement strategy can, and will, have a
negative impact on any ecommerce site’s organic performance
• Broken links are harmful for all types of sites, but they are more likely to occur on an
ecommerce site
• Discontinued or temporarily unavailable products can result in large quantities of 404s,
broken links and empty category pages (thin content)
• Displaying a 404 or an empty page to your customers results not only in bad UX but also in
large amounts of link equity being lost
Today we’ll focus on how to find out-of-stock products and thin category pages and,
as these often occur in large quantities, how to deal with them at scale.
For example:
• This product page has 91 backlinks from 28 different referring domains. The site has a
number of pages that return a 4xx status code despite having a significant number of backlinks
• As a result, it’s quite common to find large numbers of out-of-stock product pages from a
single site indexed by search engines
Tip #1:
Crawl your site to find out of stock products at scale
How to do it:
Use the ‘Custom Search’ feature in Screaming Frog to find out-of-stock products that return a 200 status
code (as these won’t be picked up by a standard crawl or in GSC)
• Step 1: find the identifiable out-of-stock copy on your product page. Using Amazon as an example, their
identifiable out-of-stock copy is “currently unavailable”
• Step 2: copy and paste this into Screaming Frog’s ‘Custom Search’ feature and run a crawl
• Step 3: the crawl will return all of the product pages that contain the out-of-stock string (“currently
unavailable” in the Amazon example). Don’t forget to manually QA the results for errors
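If you want to sanity-check the logic outside Screaming Frog, the same idea can be sketched in a few lines of Python. This is only an illustration under assumptions, not how the tool works internally: the `CrawledPage` record, the `find_out_of_stock` helper and the sample URLs are all hypothetical, and the marker should be whatever out-of-stock copy your own product template uses.

```python
from dataclasses import dataclass


@dataclass
class CrawledPage:
    """Hypothetical record for one crawled URL."""
    url: str
    status: int
    html: str


def find_out_of_stock(pages, marker):
    """Return URLs that respond with 200 but contain the out-of-stock copy."""
    needle = marker.lower()
    return [
        page.url
        for page in pages
        # Only 200 responses matter here: 4xx pages already show up as errors.
        if page.status == 200 and needle in page.html.lower()
    ]


# Made-up sample data standing in for a real crawl.
pages = [
    CrawledPage("https://example.com/p/1", 200, "<p>Currently unavailable</p>"),
    CrawledPage("https://example.com/p/2", 200, "<p>In stock</p>"),
    CrawledPage("https://example.com/p/3", 404, "<p>Currently unavailable</p>"),
]

flagged = find_out_of_stock(pages, "currently unavailable")
print(flagged)
```

The match is case-insensitive on purpose, since the same copy is often capitalised differently across templates. As with the crawl itself, the flagged list still needs a manual QA pass.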
Tip #2:
Apply this to category pages to find empty PLPs
• You can use the same process to find product listing pages (PLPs) that are empty (meaning they
have no products)
• Just paste the ‘no products’ identifier into Screaming Frog’s ‘Custom Search’, in the same way we did for
‘out of stock’ products
• Here are some examples: [screenshots in the original slides]
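Sites often use more than one ‘no products’ message across templates, so it can help to check for several identifiers at once. Here is a minimal Python sketch of that idea; the phrases in `EMPTY_PLP_PHRASES` and the helper names are assumptions for illustration, so swap in the copy your own category pages actually display.

```python
import re

# Hypothetical 'no products' identifiers; replace with your own templates' copy.
EMPTY_PLP_PHRASES = ["no products found", "0 products"]

# One case-insensitive pattern; \b stops "0 products" matching inside "10 products".
EMPTY_PLP_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(p) for p in EMPTY_PLP_PHRASES) + r")\b",
    re.IGNORECASE,
)


def is_empty_plp(html):
    """True if the page HTML contains any of the 'no products' identifiers."""
    return EMPTY_PLP_PATTERN.search(html) is not None


def find_empty_plps(pages):
    """pages maps URL -> HTML; returns the URLs that look like empty PLPs."""
    return [url for url, html in pages.items() if is_empty_plp(html)]
```

For example, `is_empty_plp("<p>No products found</p>")` is true, while a page showing “10 products” is not flagged.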
Tip #3:
Use the ‘Custom extraction’ tool to find thin PLPs
Common issues:
• Thin category pages with limited stock are also a source of bad UX
• They will result in lost sales and, when this happens at scale, can have a significant
impact on revenue (not just on SEO performance)
• They put the site at risk of algorithmic penalties
How to do it:
Taking this ASOS category page as an example:
• Step 1: fetch the contents of any container whose class contains “styleCount” by entering this XPath:
//*[contains(@class, 'styleCount')]
• Step 2: when the crawl finishes, go to the ‘Custom’ tab at the top of the tool and you’ll see the extracted values for each crawled URL
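To show what that XPath is doing, here is a rough Python equivalent using only the standard library. It is a sketch under assumptions: the sample markup is made up (it is not ASOS’s real HTML), `product_count` and `is_thin` are hypothetical helpers, and the threshold of 5 products is an arbitrary example rather than a recommendation.

```python
import re
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class ClassTextExtractor(HTMLParser):
    """Collect text inside elements whose class attribute contains a fragment,
    roughly mirroring //*[contains(@class, '...')]."""

    def __init__(self, class_fragment):
        super().__init__()
        self.class_fragment = class_fragment
        self.depth = 0  # greater than 0 while inside a matching element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = dict(attrs).get("class") or ""
        if self.depth or self.class_fragment in classes:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def product_count(html, class_fragment="styleCount"):
    """Return the first number found in the matching container, or None."""
    parser = ClassTextExtractor(class_fragment)
    parser.feed(html)
    match = re.search(r"\d[\d,]*", "".join(parser.chunks))
    return int(match.group().replace(",", "")) if match else None


def is_thin(html, threshold=5):
    """Flag a PLP whose reported product count is below the threshold."""
    count = product_count(html)
    return count is not None and count < threshold
```

With made-up markup such as `<p class="styleCount">3 styles found</p>`, `product_count` returns 3 and `is_thin` flags the page. Pages with no matching container return `None` and are not flagged, so they need a different check.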
*Note: if your page doesn’t have a container showing the number of products available, you can still count the number
of product elements on the page, for example: count(//div[@class="offer__content"])
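The counting approach in the note can be sketched the same way in Python. Again, this is an illustration under assumptions rather than a definitive implementation: `ElementCounter` and `count_elements` are hypothetical names, and like the XPath in the note it matches the class attribute exactly, so a div with extra classes is not counted.

```python
from html.parser import HTMLParser


class ElementCounter(HTMLParser):
    """Count start tags with a given name and an exact class attribute value,
    roughly mirroring count(//div[@class="offer__content"])."""

    def __init__(self, tag, class_value):
        super().__init__()
        self.tag = tag
        self.class_value = class_value
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag == self.tag and dict(attrs).get("class") == self.class_value:
            self.count += 1


def count_elements(html, tag="div", class_value="offer__content"):
    """Return how many matching elements appear in the HTML."""
    counter = ElementCounter(tag, class_value)
    counter.feed(html)
    return counter.count
```

A PLP whose count comes back very low (or zero) is a candidate thin page. If your product tiles carry several classes, a contains-style match like the one in Step 1 may suit better than an exact match.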

Crawling ecommerce sites – Maria Camanes' top tips from Tea-time SEO
