Most people have never looked at their robots.txt file.
That’s fine.
But it can quietly block your pages from Google.
And when that happens, you can write the best page on the internet and it still won’t matter. Google can’t rank what it can’t access.
What Is robots.txt (in Plain English)?
robots.txt is a simple text file that lives on your domain and tells search engines what they are allowed to crawl.
Think of it like a bouncer at the door.
- If the bouncer lets Google in, Google can read your pages.
- If the bouncer says “no,” Google can’t even see what’s inside.
Important note:
robots.txt controls crawling, not indexing. A blocked URL can still appear in search results if other pages link to it, but Google can't read its content, so it won't understand it, refresh it, or rank it reliably.
Step-by-Step: How to Check If robots.txt Is Blocking a Page
Step 1: Open Your robots.txt File
In your browser, go to:
https://yourdomain.com/robots.txt
What you’ll see:
- plain text
- no formatting
- usually just a few lines
If the file doesn’t exist (404), that’s okay.
No file usually means nothing is blocked.
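For reference, a typical file looks something like this (the paths and sitemap URL are just illustrative):

```
# Applies to all crawlers
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php

Sitemap: https://yourdomain.com/sitemap.xml
```

That's the whole format: a user-agent line, then the rules that apply to it.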
Step 2: Look for Disallow Rules
You’re looking for lines like:
Disallow: /something/
This tells search engines:
“Do not crawl anything that starts with this path.”
Examples:
- Disallow: /private/ blocks yourdomain.com/private/anything
- Disallow: /test/ blocks yourdomain.com/test/page-1
- Disallow: /staging/ blocks yourdomain.com/staging/whatever
Quick test
If your page's URL path begins with a disallowed path, it's blocked. (Rules are prefix matches; Google also supports * wildcards and $ end-anchors, but most files stick to plain prefixes.)
Example:
If robots.txt says:
Disallow: /blog/
Then anything under:
- yourdomain.com/blog/
is blocked from crawling.
That’s usually not what people intend.
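If you'd rather not eyeball prefixes, Python's standard library ships a parser that applies the same prefix-matching rules. A minimal sketch — the domain and paths here are placeholders:

```python
from urllib import robotparser

# Parse rules directly from text. Against a live site you'd instead call
# rp.set_url("https://yourdomain.com/robots.txt") followed by rp.read().
rules = """\
User-agent: *
Disallow: /blog/
"""

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# Anything under /blog/ is blocked; everything else is crawlable.
print(rp.can_fetch("Googlebot", "https://yourdomain.com/blog/post-1"))  # False
print(rp.can_fetch("Googlebot", "https://yourdomain.com/about/"))       # True
```

Handy for batch-checking a list of URLs before you start guessing in the browser.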
Step 3: Use Google Search Console to Confirm
This is how you verify it without guessing.
- Open Google Search Console
- Paste the full page URL into the top search bar
- Press Enter
- Review the URL Inspection report
If it says:
- “Blocked by robots.txt”
That’s your answer.
Common Things That Get Blocked by Accident
These show up constantly:
- /wp-admin/ (fine to block)
- /private/ (fine if intentional)
- /test/
- /staging/
- Old directories that no longer matter
Entire-site blocks like:
Disallow: /
This blocks everything. Great for staging. Catastrophic for production.
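You can confirm just how absolute that rule is with the same stdlib parser (the domain is a placeholder):

```python
from urllib import robotparser

rp = robotparser.RobotFileParser()
rp.parse(["User-agent: *", "Disallow: /"])

# Even the homepage is off-limits under Disallow: /
print(rp.can_fetch("Googlebot", "https://yourdomain.com/"))  # False
```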
What To Do If You Find a Block
Ask one question:
Should Google be able to crawl this page?
If YES (it should be crawlable)
- Remove or narrow the disallow rule
- Save the file
- Then go back to Search Console and Request Indexing for the URL
- Keep in mind Google caches robots.txt, so the change can take up to a day or so to register
If NO (it should stay hidden)
- Leave it blocked
- Congrats, your site is behaving like it should
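As an example of "narrowing" a rule rather than deleting it: suppose /blog/ was blocked but only your drafts folder should stay hidden (paths are illustrative):

```
# Before: blocks the whole blog
User-agent: *
Disallow: /blog/

# After: blocks only the drafts folder
User-agent: *
Disallow: /blog/drafts/
```

Same gatekeeper, smaller gate.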
How to Think About This (POV)
Here’s the part people get wrong.
They treat robots.txt like a ranking lever. It’s not. It’s access control. It’s a gatekeeper. And gatekeepers are not subtle.
If a page is blocked, you don’t “optimize” your way out of it. You either open the gate or you don’t.
Also, blocking is not automatically bad. Your site should have areas Google never needs to crawl. Admin paths. Internal tools. Testing directories. A clean robots.txt is a sign your site has boundaries.
The real danger is accidental blocks that sit there for months because nobody thinks to check the file. That’s why this Lab matters. It’s not sexy. It’s not advanced. It’s just one of those small things that can quietly make your SEO feel broken.
Blocking is powerful.
Use it intentionally.