Indexable helps search engines and AI understand your React Site better.

Amit00008/indexable

React Indexable

Serving the same content to humans, search engines, and AI crawlers the right way.

License: MIT


The Problem

When we build websites, we think about human users first: design, images, interactions, and layout. But today, the web is read by more than just humans:

  • Search engines crawl pages to index and rank them
  • AI crawlers read content to understand, summarize, and reuse it

The problem is that humans, search engines, and AI crawlers do not read the web the same way.

What Goes Wrong?

A typical web page contains images, buttons, layout containers, and JavaScript-driven interactions. This works great for humans, but machines care about content clarity, not UI.

  • Search engines must parse unnecessary UI elements before reaching core text
  • Important information may be hidden behind JavaScript interactions
  • AI crawlers struggle to extract clean, structured information
  • Educational content, tutorials, and Q&A become harder to interpret

The content exists, but machines don't read it efficiently.


The Solution

Indexable solves this by separating content from presentation. It automatically extracts semantic content from your rendered UI and injects it as hidden, crawlable markup.

Core Principle: Same Content, Different Presentation

  • Humans → Rich UI (images, buttons, interactions)
  • Search Engines → Clean, fast, text-focused HTML
  • AI Crawlers → Structured Markdown

As long as the meaning and data remain the same, this approach is SEO-safe and future-proof.

Key Features

  • Automatic Extraction: Identifies and preserves meaningful content (headings, paragraphs, lists, code blocks)
  • UI Noise Removal: Strips away interactive elements (buttons, forms, navigation)
  • Image Handling: Converts images to text representations using alt attributes
  • Deterministic: No AI and no guessing, only pure, predictable content transformation
  • SEO Safe: Hidden with CSS only (display: none), no cloaking, no user-agent detection
  • Zero Configuration: Simple API with sensible defaults

Installation

npm install react-indexable

Quick Start

Basic Example

Let's take a simple math question page:

import { useState } from 'react';
import { Indexable } from 'react-indexable';

export default function MathQuestion() {
  // Tracks whether the answer has been revealed
  const [showAnswer, setShowAnswer] = useState(false);

  return (
    <>
      <main id="content">
        <h1>Math Question</h1>
        <img src="math.png" alt="Math illustration" />
        <p><strong>Question:</strong> What is 8 + 4?</p>
        <button onClick={() => setShowAnswer(true)}>Show Answer</button>
        {showAnswer && <p>Answer: 12</p>}
      </main>

      <Indexable source="#content" />
    </>
  );
}


What humans see:

What gets extracted for crawlers:

# Math Question


[Image: Math illustration]


**Question:** What is 8 + 4?

Answer: 12

The button is automatically removed. Images are converted to text. Only semantic content remains.


Advanced Example

Component-Based Application

import { Indexable } from 'react-indexable';

function Hero() {
  // Placeholder click handler so the example is self-contained
  const handleClick = () => {};

  return (
    <section className="hero-section">
      <h1>Product Name</h1>
      <p>Revolutionary solution for modern web development.</p>
      <button onClick={handleClick}>Get Started</button>
      <img src="hero.jpg" alt="Product dashboard screenshot" />
    </section>
  );
}

function Features() {
  return (
    <section className="features-grid">
      <h2>Key Features</h2>
      <div className="feature-cards">
        <div className="card">
          <img src="icon1.svg" alt="Performance icon" />
          <h3>Fast Performance</h3>
          <p>Optimized for speed</p>
        </div>
        <div className="card">
          <img src="icon2.svg" alt="Integration icon" />
          <h3>Easy Integration</h3>
          <p>Works with existing tools</p>
        </div>
      </div>
    </section>
  );
}

export default function LandingPage() {
  return (
    <>
      <main id="main-content">
        <Hero />
        <Features />
      </main>

      <Indexable source="#main-content" />
    </>
  );
}


Extracted Markdown (what crawlers see):

# Product Name

Revolutionary solution for modern web development.



[Image: Product dashboard screenshot]


## Key Features


[Image: Performance icon]


### Fast Performance

Optimized for speed



[Image: Integration icon]


### Easy Integration

Works with existing tools

All CSS classes, interactive elements, wrapper divs, and layout containers are removed. Only semantic content remains.


Why This Approach Works

No Content Mismatch

The same factual content is served to everyone. Only the presentation differs.

Faster Crawling

Search engines immediately see clean content without parsing UI elements.

Better AI Understanding

AI systems get structured Markdown without HTML noise.

SEO Safe


API Reference

<Indexable />

The primary component for content extraction.

Props

| Prop | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| source | string | Yes | - | CSS selector of the content container to extract from |
| enabled | boolean | No | true | Toggle extraction on/off |
| onExtract | (markdown: string) => void | No | - | Callback function that receives the extracted markdown |

Example with All Props

<Indexable
  source="#content"
  enabled={true}
  onExtract={(markdown) => {
    console.log('Extracted:', markdown);
  }}
/>
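As a rough illustration of what an onExtract callback could do with the markdown it receives, the sketch below builds a callback that reports a small summary instead of logging the raw text. The names `ExtractionSummary`, `summarize`, and `createOnExtract` are hypothetical helpers invented for this example and are not part of react-indexable; only the `(markdown: string) => void` signature comes from the props table.

```typescript
// Mirrors the documented callback signature: (markdown: string) => void
type OnExtract = (markdown: string) => void;

// Hypothetical summary shape, not part of react-indexable
interface ExtractionSummary {
  headings: number;
  images: number;
  characters: number;
}

// Counts Markdown headings and "[Image: ...]" placeholders in the extracted text
function summarize(markdown: string): ExtractionSummary {
  const lines = markdown.split('\n');
  return {
    headings: lines.filter((line) => /^#{1,6}\s/.test(line)).length,
    images: (markdown.match(/\[Image: [^\]]*\]/g) ?? []).length,
    characters: markdown.length,
  };
}

// Wraps a reporting function into a callback with the documented signature
function createOnExtract(report: (summary: ExtractionSummary) => void): OnExtract {
  return (markdown) => report(summarize(markdown));
}
```

The resulting function can be passed as the onExtract prop, for example `<Indexable source="#content" onExtract={createOnExtract(sendToAnalytics)} />`, where `sendToAnalytics` stands for whatever reporting function your application provides.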

How It Works

Indexable follows a four-step process to transform UI-heavy content into clean, crawlable markup:

1. DOM Extraction

Clones the specified content container from the DOM without mutating the original. Your interactive UI remains untouched.

const cloned = sourceElement.cloneNode(true);

2. Semantic Filtering

Identifies and preserves only meaningful content:

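Keeps: headings, paragraphs, lists, and code blocks

Removes: interactive elements such as buttons, forms, and navigation

Converts: images to text representations based on their alt attributes

Strips: CSS classes, wrapper divs, and layout containers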
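The filtering rules can be pictured with a small sketch over a simplified node tree. This is not the library's implementation: `SketchNode`, `REMOVED_TAGS`, `UNWRAPPED_TAGS`, and `filterNode` are hypothetical names, the tag lists are inferred from the Key Features list (buttons, forms, navigation, wrapper containers), and dropping every attribute is an assumption made to keep the example short.

```typescript
// Simplified stand-in for DOM nodes; the real component operates on actual DOM elements
type SketchNode =
  | { kind: 'text'; value: string }
  | { kind: 'element'; tag: string; attrs: Record<string, string>; children: SketchNode[] };

// Assumed tag lists, inferred from the README's description
const REMOVED_TAGS = new Set(['button', 'form', 'nav']);
const UNWRAPPED_TAGS = new Set(['div', 'span', 'section']);

function filterNode(node: SketchNode): SketchNode[] {
  if (node.kind === 'text') return [node];

  // Interactive UI is dropped together with its children
  if (REMOVED_TAGS.has(node.tag)) return [];

  // Images become a text placeholder built from the alt attribute
  if (node.tag === 'img') {
    const alt = node.attrs['alt'];
    return [{ kind: 'text', value: `[Image: ${alt ? alt : 'Image'}]` }];
  }

  const children: SketchNode[] = [];
  for (const child of node.children) {
    children.push(...filterNode(child));
  }

  // Layout containers are unwrapped: their filtered children move up one level
  if (UNWRAPPED_TAGS.has(node.tag)) return children;

  // Semantic elements are kept, with attributes (classes, ids) stripped
  return [{ kind: 'element', tag: node.tag, attrs: {}, children }];
}
```

Returning an array from `filterNode` lets a single rule express all three outcomes: zero nodes for removed elements, several nodes for unwrapped containers, and one node for kept elements.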

3. Markdown Conversion

Converts cleaned HTML to Markdown using Turndown with deterministic rules. No AI is involved, only a pure transformation.
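To make "deterministic rules" concrete, here is a minimal hand-rolled sketch of rule-based Markdown rendering. It deliberately does not use Turndown or mirror its API; the `Block` type and the `renderBlock` and `toMarkdown` functions are hypothetical and only illustrate that each kind of content maps to one fixed output form, so the same input always yields the same text.

```typescript
// Hypothetical intermediate representation of already-filtered content
type Block =
  | { type: 'heading'; level: number; text: string }
  | { type: 'paragraph'; text: string }
  | { type: 'list'; items: string[] };

// One fixed rule per block type; no randomness and no external state
function renderBlock(block: Block): string {
  if (block.type === 'heading') {
    // Clamp to the six heading levels Markdown supports
    const level = Math.min(6, Math.max(1, block.level));
    return `${'#'.repeat(level)} ${block.text}`;
  }
  if (block.type === 'paragraph') {
    return block.text;
  }
  return block.items.map((item) => `- ${item}`).join('\n');
}

// Blocks are separated by a blank line, as in the extracted output shown earlier
function toMarkdown(blocks: Block[]): string {
  return blocks.map(renderBlock).join('\n\n');
}
```

Because every rule is a pure function of its input, calling `toMarkdown` twice on the same blocks produces identical strings, which is the property the README relies on when it calls the transformation predictable.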

4. Hidden Injection

Injects the markdown into a hidden container that's visible to crawlers but not to users:

<div data-indexable style="display:none" aria-hidden="true">
  <article>
    <pre><!-- extracted markdown --></pre>
  </article>
</div>
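The shape of the hidden container can be sketched as a plain string builder. This is an illustration under assumptions, not the component's actual code: `escapeHtml` and `buildHiddenContainer` are hypothetical names, and escaping the markdown before placing it inside the pre element is a choice made for this example so that characters such as `<` cannot be parsed as markup.

```typescript
// Escapes the characters that would otherwise be interpreted as HTML
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Builds the hidden container described above around the extracted markdown
function buildHiddenContainer(markdown: string): string {
  return [
    '<div data-indexable style="display:none" aria-hidden="true">',
    '  <article>',
    `    <pre>${escapeHtml(markdown)}</pre>`,
    '  </article>',
    '</div>',
  ].join('\n');
}
```

The ampersand is replaced first so that the entities produced by the later replacements are not escaped a second time.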

Image Handling

Images are automatically converted to text representations using their alt attributes, making visual content accessible to text-based crawlers:

Input:

<img src="dashboard.png" alt="Analytics dashboard showing user metrics" />

Output:

[Image: Analytics dashboard showing user metrics]

If no alt attribute is provided, it defaults to [Image: Image].
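The conversion rule is small enough to write out as a function. The sketch below is an assumption-labelled illustration rather than the library's code: `imageToText` is a hypothetical name, and treating an empty or whitespace-only alt value the same as a missing one is a choice made for this example, since the README only specifies the missing-attribute case.

```typescript
// Converts an image's alt text into the "[Image: ...]" placeholder format
function imageToText(alt?: string | null): string {
  // Assumption: blank alt text falls back to the same default as a missing attribute
  const label = (alt ?? '').trim();
  return `[Image: ${label.length > 0 ? label : 'Image'}]`;
}
```

Descriptive alt text therefore carries directly into the crawlable output, which is one more reason to write alt attributes that describe what the image shows.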


Verification

To verify that Indexable is working correctly:

  1. Right-click on your page and select View Page Source
  2. Press Ctrl+F (or Cmd+F on Mac) and search for data-indexable
  3. You should see the hidden container with your extracted content in clean Markdown format
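The manual check can also be approximated in code. The helper below is a hypothetical sketch, not something shipped with react-indexable: it counts opening tags that carry a data-indexable attribute in an HTML string, using a regular expression rather than a full HTML parser, so it is only suitable for quick sanity checks.

```typescript
// Counts opening tags that have a data-indexable attribute.
// The lookahead ensures longer names such as data-indexable-foo do not match.
function countIndexableContainers(html: string): number {
  const matches = html.match(/<[a-z][^>]*\sdata-indexable(?=[\s=>\/])[^>]*>/gi);
  return matches ? matches.length : 0;
}
```

In a browser console, passing `document.documentElement.outerHTML` checks the live DOM, which also covers markup that is injected after the initial HTML has loaded.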

Core Philosophy

Indexable operates under strict principles to ensure SEO safety and content integrity:

Indexable does not change what content is served. It changes how clearly that content can be understood.


Use Cases

Educational Content & Tutorials

Extract learning material while removing code editors, interactive widgets, and UI controls.

<article id="tutorial">
  <h1>Learn React Hooks</h1>
  <p>useState allows you to add state to functional components.</p>
  <CodeEditor /> {/* Removed from extraction */}
  <button>Run Code</button> {/* Removed from extraction */}
</article>

<Indexable source="#tutorial" />


Product Pages

Index product descriptions and specifications without "Add to Cart" buttons, image galleries, and review widgets.

Documentation Sites

Make API references and guides crawlable without navigation menus, search bars, and interactive examples.

Blog Posts & Articles

Extract article content while removing social sharing buttons, comment forms, and related post widgets.

Q&A Platforms

Preserve questions and answers while removing voting buttons, user avatars, and interaction controls.


Achievements

By using Indexable, you achieve:


TypeScript Support

Indexable is written in TypeScript and includes full type definitions.

import { Indexable, IndexableProps } from 'react-indexable';

const props: IndexableProps = {
  source: '#content',
  enabled: true,
  onExtract: (markdown: string) => {
    // Type-safe callback
  }
};

Browser Compatibility

Indexable works in all modern browsers that support:


Performance Considerations


Limitations


Contributing

Contributions are welcome! Please follow these guidelines:

Getting Started

  1. Fork the repository
  2. Clone your fork: git clone https://github.com/Amit00008/indexable.git
  3. Install dependencies: npm install
  4. Create a branch: git checkout -b feature/your-feature-name

Development Workflow

# Build the package
npm run build

Testing Your Changes

  1. Make changes to the source code in src/
  2. Build the package: npm run build
  3. Test in the playground:
    cd playground
    npm install
    npm run dev
  4. Open http://localhost:3000 to see your changes

Contribution Guidelines

Pull Request Process

  1. Ensure your code builds without errors
  2. Update documentation to reflect changes
  3. Write a clear PR description explaining the changes
  4. Reference any related issues

What We're Looking For

What We're Not Looking For


Support


Acknowledgments

Built with:


Roadmap

Future enhancements under consideration:


Conclusion

The web is no longer read by humans alone. Search engines and AI crawlers consume content differently, and designing only for UI is no longer enough.

By separating content from presentation, Indexable helps you:

This approach is simple, safe, and practical, and it fits well into modern web development.

Remember: Indexable is an infrastructure primitive, not an SEO hack. Use it to make your content more accessible to search engines and AI systems while maintaining the same content for all users.
