Why Use Bright Data With LlamaIndex?
The Bright Data tool provides the following capabilities:
Web Scraping
- scrape_as_markdown
 Scrape a webpage and convert the content to Markdown format. This tool can bypass CAPTCHA and bot detection.
Visual Capture
- get_screenshot
 Take a screenshot of a webpage and save it to a file.
Search Engine Access
- search_engine
 Search Google, Bing, or Yandex and get structured search results as JSON or Markdown. Supports advanced parameters for more specific searches.
Structured Web Data Extraction
- web_data_feed
 Retrieve structured data from various platforms including LinkedIn, Amazon, Instagram, Facebook, X (Twitter), Zillow, and more.
Advanced Configuration
The Bright Data tool offers various configuration options for specialized use cases:
Search Engine Parameters
The search_engine function supports advanced parameters such as:
- Language targeting (language parameter)
- Country-specific search (country_code parameter)
- Different search types (images, shopping, news, etc.)
- Pagination controls
- Mobile device emulation
- Geolocation targeting
- Hotel search parameters
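As a rough illustration of how these options might be combined, the sketch below assembles keyword arguments for a search_engine call. Only language and country_code are named in the list above; search_type, start, and device are assumed names for the search-type, pagination, and mobile-emulation options, and build_search_kwargs is a hypothetical helper, so check the tool's actual signature before relying on them.

```python
from typing import Any, Dict, Optional


def build_search_kwargs(
    query: str,
    language: Optional[str] = None,      # named in the parameter list above
    country_code: Optional[str] = None,  # named in the parameter list above
    search_type: Optional[str] = None,   # ASSUMED name, e.g. "news" or "images"
    start: Optional[int] = None,         # ASSUMED name for the pagination offset
    device: Optional[str] = None,        # ASSUMED name for mobile emulation
) -> Dict[str, Any]:
    """Collect the query plus only those options that were actually set."""
    options = {
        "language": language,
        "country_code": country_code,
        "search_type": search_type,
        "start": start,
        "device": device,
    }
    kwargs: Dict[str, Any] = {"query": query}
    # Drop unset options so the tool's own defaults stay in effect.
    kwargs.update({k: v for k, v in options.items() if v is not None})
    return kwargs


kwargs = build_search_kwargs(
    "llamaindex agents", language="en", country_code="us", search_type="news"
)
print(kwargs)
```

Once the parameter names are confirmed, the dictionary could be unpacked into the call, for example search_engine(**kwargs).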
Supported Web Data Sources
The web_data_feed function supports retrieving structured data from:
- LinkedIn (profiles and companies)
- Amazon (products and reviews)
- Instagram (profiles, posts, reels, comments)
- Facebook (posts, marketplace listings, company reviews)
- X/Twitter (posts)
- Zillow (property listings)
- Booking.com (hotel listings)
- YouTube (videos)
- ZoomInfo (company profiles)
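Because web_data_feed covers a fixed set of platforms, it can help to check a URL against that set before requesting data. The sketch below is one way to do so; detect_platform and the domain table are hypothetical helpers written for this illustration (the domains are the platforms' commonly known ones), not part of the Bright Data tool.

```python
from typing import Optional
from urllib.parse import urlparse

# Hypothetical lookup table built from the platform list above.
SUPPORTED_DOMAINS = {
    "linkedin.com": "LinkedIn",
    "amazon.com": "Amazon",
    "instagram.com": "Instagram",
    "facebook.com": "Facebook",
    "x.com": "X/Twitter",
    "twitter.com": "X/Twitter",
    "zillow.com": "Zillow",
    "booking.com": "Booking.com",
    "youtube.com": "YouTube",
    "zoominfo.com": "ZoomInfo",
}


def detect_platform(url: str) -> Optional[str]:
    """Return the supported platform a URL belongs to, or None."""
    host = (urlparse(url).hostname or "").lower()
    for domain, platform in SUPPORTED_DOMAINS.items():
        # Match the bare domain or any subdomain such as "www.".
        if host == domain or host.endswith("." + domain):
            return platform
    return None


print(detect_platform("https://www.linkedin.com/in/someone"))
print(detect_platform("https://example.com/page"))
```

A URL that returns None is likely a better fit for scrape_as_markdown than for web_data_feed.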
How to Integrate Bright Data With LlamaIndex?
1. Obtain Your Bright Data API Key
- Log in to your Bright Data dashboard.
- Go to Account Settings.
- Generate an API key if you haven’t already done so.
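Once the key exists, a common practice is to keep it out of source code and read it from the environment. The following is a minimal sketch under that assumption; the variable name BRIGHT_DATA_API_KEY and the load_api_key helper are hypothetical choices for this example, not names required by Bright Data.

```python
import os


def load_api_key(env_var: str = "BRIGHT_DATA_API_KEY") -> str:
    """Read the API key from the environment and fail early if it is absent."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"{env_var} is not set; generate a key in the Bright Data "
            "dashboard and export it before running."
        )
    return key
```

Calling load_api_key() at startup surfaces a missing key immediately rather than at the first web request.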
2. Installation
Install the required packages:
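Assuming the integration is published as llama-index-tools-brightdata (an assumed package name worth confirming on the package index), installation would typically be pip install llama-index llama-index-tools-brightdata. A quick check like the one below can then confirm that the packages are importable; the module paths it lists are likewise assumptions.

```python
import importlib.util
from typing import List

# ASSUMED module paths; adjust them if the installed packages differ.
REQUIRED_MODULES = ["llama_index.core", "llama_index.tools.brightdata"]


def is_importable(module_name: str) -> bool:
    """Return True if the module can be located without importing it fully."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # find_spec raises when a parent package is missing.
        return False


def missing_modules(names: List[str] = REQUIRED_MODULES) -> List[str]:
    """List the required modules that cannot be found."""
    return [name for name in names if not is_importable(name)]


for name in missing_modules():
    print(f"Missing: {name}")
```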
3. Usage
Here’s an example of how to use the BrightDataToolSpec with LlamaIndex:
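A minimal sketch of such usage follows. It assumes the class is importable from llama_index.tools.brightdata and accepts an api_key argument, and it reads the key from a hypothetical BRIGHT_DATA_API_KEY environment variable; to_tool_list() is the standard LlamaIndex tool-spec method for turning a spec into agent tools. The expected tool names are taken from the capability list earlier on this page.

```python
import os
from typing import Iterable, List

# Tool names taken from the capability list earlier on this page.
EXPECTED_TOOLS = {
    "scrape_as_markdown",
    "get_screenshot",
    "search_engine",
    "web_data_feed",
}


def missing_tools(tool_names: Iterable[str]) -> List[str]:
    """Return expected tool names that the spec did not expose."""
    return sorted(EXPECTED_TOOLS - set(tool_names))


def build_tools(api_key: str):
    """Create the Bright Data tools for use with a LlamaIndex agent."""
    # Imported lazily so this file loads even before the package is installed.
    from llama_index.tools.brightdata import BrightDataToolSpec  # ASSUMED path

    spec = BrightDataToolSpec(api_key=api_key)  # ASSUMED constructor argument
    return spec.to_tool_list()


def main() -> None:
    api_key = os.environ.get("BRIGHT_DATA_API_KEY")  # hypothetical variable name
    if not api_key:
        print("Set BRIGHT_DATA_API_KEY to run this example.")
        return
    try:
        tools = build_tools(api_key)
    except ImportError:
        print("Install the Bright Data tool package first.")
        return
    names = [tool.metadata.name for tool in tools]
    print("Available tools:", names)
    print("Missing tools:", missing_tools(names))


if __name__ == "__main__":
    main()
```

The list returned by build_tools can then be passed to whichever LlamaIndex agent you use, which decides when to call each tool.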