Tutorial: Extract News Headlines from BBC News
Learn how to scrape news headlines from the BBC News website using the Easy Scrape API. No coding experience required!
What You'll Learn
In this tutorial, you'll discover how to:
- Extract news headlines from a live website
- Use AI to generate scraping code automatically
- Test your scraping with Postman
- Handle dynamic content with Easy Scrape API
Prerequisites
Step 1: Understand What We're Scraping
We'll scrape news headlines from BBC News (https://www.bbc.com/news) because:
- ✅ It's a public website
- ✅ It has clear, structured content
- ✅ Headlines are easy to identify
- ✅ It's a real-world use case
Step 2: Get the Page HTML for AI Analysis
First, we need to get the HTML structure so AI can help us create the scraping code.
2.1 Set Up Basic Postman Request
- Open Postman and create a new request
- Set the method to POST
- Set the URL to: https://easy-scrape-api.p.rapidapi.com/api/scrape
2.2 Add Headers
Go to the Headers tab and add:
| Key | Value |
|---|---|
| X-RapidAPI-Key | YOUR_RAPIDAPI_KEY |
| X-RapidAPI-Host | easy-scrape-api.p.rapidapi.com |
| Content-Type | application/json |
2.3 Get Page HTML
In the Body tab (select raw → JSON), add:
{
"url": "https://www.bbc.com/news",
"outputFormat": "html"
}
Click Send to get the page HTML. Save the HTML from the response as a file (e.g., bbc_news.html) and then upload this file to your AI assistant for analysis.
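If you would rather send the same request from a script than from Postman, the sketch below builds the identical POST in Node.js. It assumes Node.js 18 or later (for the built-in fetch). The helper names buildScrapeRequest and fetchPageHtml and the RAPIDAPI_KEY environment variable are illustrative, not part of the Easy Scrape API; only the endpoint, headers, and body fields come from this tutorial.

```javascript
// Endpoint and host taken from Steps 2.1 and 2.2 of this tutorial.
const ENDPOINT = 'https://easy-scrape-api.p.rapidapi.com/api/scrape';
const HOST = 'easy-scrape-api.p.rapidapi.com';

// Hypothetical helper: assembles the same headers and JSON body used in Postman.
function buildScrapeRequest(apiKey, targetUrl, outputFormat) {
  return {
    endpoint: ENDPOINT,
    options: {
      method: 'POST',
      headers: {
        'X-RapidAPI-Key': apiKey,
        'X-RapidAPI-Host': HOST,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ url: targetUrl, outputFormat }),
    },
  };
}

// Hypothetical helper: sends the request and returns the raw response text.
async function fetchPageHtml(apiKey) {
  const { endpoint, options } = buildScrapeRequest(apiKey, 'https://www.bbc.com/news', 'html');
  const response = await fetch(endpoint, options);
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.text();
}

// Usage (not run automatically, because it needs a real key and network access):
// fetchPageHtml(process.env.RAPIDAPI_KEY).then((html) => console.log(html.length));
```

The request is built separately from the network call so you can inspect the headers and body before spending an API call.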
Step 3: Generate Scraping Code with AI
Now we'll use AI to create the JavaScript code for extracting headlines.
3.1 AI Prompt Template
Upload the HTML file you saved in Step 2 to your favorite AI assistant (ChatGPT, Claude, Gemini, etc.) and use this prompt:
I want to scrape news headlines from the BBC News website. I've uploaded an HTML file containing the page structure.
Please analyze the HTML file and write JavaScript code for the Easy Scrape API that:
1. Waits for the page to fully load
2. Extracts all news headlines from the page
3. Returns them as an array of objects with title and link
4. Uses the 'page' object (Puppeteer v24) and 'cheerio' for parsing
5. Includes console.log statements for debugging
6. Don't include inline comments or block comments
The AI might generate unnecessary code, such as imports (const puppeteer = require('puppeteer')) or navigation statements (await page.goto('https://...')).
Remove any code that includes:
- Import statements (puppeteer, cheerio imports)
- page.goto() calls
- Browser launch/close code
Only copy the code that starts AFTER any page.goto() statements. The Easy Scrape API handles navigation automatically.
Remember: Your script must always include a return statement with your data!
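The cleanup rules above can be roughly automated. The following is a heuristic sketch in Node.js, not a feature of the Easy Scrape API: the function name stripUnsupportedLines and the list of patterns are assumptions made for illustration, and it only removes statements that fit on a single line, so the result still needs a manual check.

```javascript
// Heuristic cleanup: drops whole lines that the tutorial says to remove.
// It only handles single-line statements, so review the result by hand.
function stripUnsupportedLines(code) {
  const unwanted = [
    /\brequire\s*\(/,           // CommonJS imports such as require('puppeteer')
    /^\s*import\s/,             // ES module imports
    /\bpage\.goto\s*\(/,        // navigation, which the API handles itself
    /\bpuppeteer\.launch\s*\(/, // browser launch
    /\bbrowser\.newPage\s*\(/,  // page creation
    /\bbrowser\.close\s*\(/,    // browser shutdown
  ];
  return code
    .split('\n')
    .filter((line) => !unwanted.some((pattern) => pattern.test(line)))
    .join('\n');
}

// Example input resembling what an AI assistant might return.
const generated = [
  "const puppeteer = require('puppeteer');",
  "const cheerio = require('cheerio');",
  "await page.goto('https://www.bbc.com/news');",
  "await page.waitForSelector('h3');",
  'return [];',
].join('\n');

console.log(stripUnsupportedLines(generated));
```

With this input, only the waitForSelector and return lines remain.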
3.2 Example AI-Generated Code
The AI might generate something like this:
await page.waitForSelector('h3');
console.log('Page loaded, starting headline extraction...');
const html = await page.content();
const $ = cheerio.load(html);
const headlines = [];
$('h3').each((index, element) => {
  const $element = $(element);
  const title = $element.text().trim();
  const link = $element.find('a').attr('href') || $element.closest('a').attr('href');
  if (title && title.length > 10) { // Filter out short/empty titles
    headlines.push({
      title: title,
      link: link ? (link.startsWith('http') ? link : `https://www.bbc.com${link}`) : null
    });
  }
});
console.log(`Found ${headlines.length} headlines`);
return headlines.slice(0, 10);
Step 4: Convert Code for Postman
Use our Online Code Parser Tool to convert the AI-generated code:
- Go to Online Code Parser Tool
- Select "Postman" from the dropdown
- Paste the AI-generated JavaScript code
- Click "Convert"
- Copy the converted string
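This tutorial does not describe exactly what the Online Code Parser Tool does. Assuming its job is to turn multi-line JavaScript into a single JSON-safe string (escaping quotes, backslashes, and newlines), the standard JSON.stringify function does the equivalent locally. A minimal sketch under that assumption, with a shortened script for illustration:

```javascript
// Assumption: the "script" field just needs the code as one valid JSON string.
const script = [
  "await page.waitForSelector('h3');",
  'const html = await page.content();',
  'const $ = cheerio.load(html);',
  "return $('h3').first().text().trim();",
].join('\n');

// JSON.stringify escapes newlines and quotes while building the request body.
const requestBody = JSON.stringify({
  url: 'https://www.bbc.com/news',
  script,
  outputFormat: 'json',
});

console.log(requestBody);
```

Parsing the printed body with JSON.parse gives back the original script unchanged, which is a quick way to confirm that nothing was lost in the escaping.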
Step 5: Test with Easy Scrape API
5.1 Update Postman Request
Back in Postman, update your request body:
{
"url": "https://www.bbc.com/news",
"script": "PASTE_YOUR_CONVERTED_CODE_HERE",
"outputFormat": "json"
}
5.2 Send the Request
Click Send and you should see a response like:
{
"message": "Script executed successfully",
"data": [
{
"title": "Breaking: Major news story headline here",
"link": "https://www.bbc.com/news/article-12345"
},
{
"title": "Another important news headline",
"link": "https://www.bbc.com/news/article-67890"
}
],
"executionTime": 2500,
"logs": [
"Page loaded, starting headline extraction...",
"Found 15 headlines"
]
}
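Once you have a response in this shape, a few lines of Node.js can turn it into a readable list. The sketch below works on a hard-coded copy of the sample response above; the helper names extractHeadlines and formatHeadlines are hypothetical, and the field names (message, data, logs) are taken only from that sample, so a real response may differ.

```javascript
// Sample response copied from the shape shown above (values are illustrative).
const sampleResponse = {
  message: 'Script executed successfully',
  data: [
    { title: 'Breaking: Major news story headline here', link: 'https://www.bbc.com/news/article-12345' },
    { title: 'Another important news headline', link: 'https://www.bbc.com/news/article-67890' },
  ],
  executionTime: 2500,
  logs: ['Page loaded, starting headline extraction...', 'Found 15 headlines'],
};

// Hypothetical helper: returns the headline list, or throws if "data" is not an array.
function extractHeadlines(response) {
  if (!Array.isArray(response.data)) {
    throw new Error(`Unexpected response: ${response.message}`);
  }
  return response.data;
}

// Hypothetical helper: one printable line per headline.
function formatHeadlines(headlines) {
  return headlines.map((item, index) => `${index + 1}. ${item.title} (${item.link})`);
}

formatHeadlines(extractHeadlines(sampleResponse)).forEach((line) => console.log(line));
sampleResponse.logs.forEach((entry) => console.log(`[log] ${entry}`));
```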
Step 6: Improve Your Script
6.1 Refine with AI
If the results aren't perfect, go back to your AI assistant with:
The code worked but I'm getting some unwanted results. Here's what I got:
[PASTE YOUR ACTUAL RESULTS]
Can you improve the code to:
1. Filter out navigation links and ads
2. Only get main story headlines
3. Make sure all links are complete URLs
4. Limit to the top 5 most important stories
Please update the JavaScript code.
6.2 Test Different Selectors
You can also ask AI to try different approaches:
The previous selectors might not be catching all headlines. Can you create 2-3 different versions of the code that try different CSS selectors for BBC News headlines? I want to test which works best.
Step 7: Real-World Applications
Now that you can scrape BBC News headlines, you can:
7.1 Monitor Breaking News
Set up automated requests to check for new headlines every hour.
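One simple way to run a check every hour from a long-running Node.js process is setInterval. This is only a sketch: startPolling is a hypothetical helper, and checkFn stands for whatever function sends your Easy Scrape API request. A cron job or an automation platform would work equally well.

```javascript
const ONE_HOUR_MS = 60 * 60 * 1000;

// Hypothetical helper: calls checkFn every intervalMs and logs failures
// instead of letting one bad run stop the schedule.
function startPolling(checkFn, intervalMs = ONE_HOUR_MS) {
  return setInterval(() => {
    Promise.resolve()
      .then(checkFn)
      .catch((error) => console.error('Headline check failed:', error.message));
  }, intervalMs);
}

// Usage (commented out so the example does not keep the process running):
// const timer = startPolling(() => console.log('Checking headlines...'));
// clearInterval(timer); // stops the schedule
```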
7.2 Create News Alerts
Compare current headlines with previous results to detect new stories.
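The comparison can be sketched as a set difference on links. The function name findNewHeadlines and the sample data are illustrative; the only assumption carried over from this tutorial is that each result is an object with title and link.

```javascript
// Returns the items in `current` whose link did not appear in `previous`.
// Items without a link fall back to their title as the comparison key.
function findNewHeadlines(previous, current) {
  const keyOf = (item) => item.link || item.title;
  const seen = new Set(previous.map(keyOf));
  return current.filter((item) => !seen.has(keyOf(item)));
}

// Illustrative data from two consecutive runs.
const previousRun = [
  { title: 'Another important news headline', link: 'https://www.bbc.com/news/article-67890' },
];
const currentRun = [
  { title: 'Breaking: Major news story headline here', link: 'https://www.bbc.com/news/article-12345' },
  { title: 'Another important news headline', link: 'https://www.bbc.com/news/article-67890' },
];

console.log(findNewHeadlines(previousRun, currentRun));
```

Comparing by link rather than by title means a headline that is reworded after publication is not reported as a new story.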
7.3 Analyze News Trends
Collect headlines over time to analyze trending topics.
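As a starting point for trend analysis, you can count how often each word appears across the collected titles. The sketch below is deliberately naive: the stop-word list, the three-letter minimum, and the sample titles are all assumptions chosen for illustration.

```javascript
// Small, illustrative stop-word list; extend it for real use.
const STOP_WORDS = new Set(['the', 'and', 'for', 'with', 'from', 'that', 'this']);

// Counts word occurrences across titles and returns [word, count] pairs,
// sorted from most to least frequent.
function countKeywords(headlines) {
  const counts = new Map();
  for (const { title } of headlines) {
    const words = title.toLowerCase().match(/[a-z']+/g) || [];
    for (const word of words) {
      if (word.length < 3 || STOP_WORDS.has(word)) continue;
      counts.set(word, (counts.get(word) || 0) + 1);
    }
  }
  return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}

// Made-up titles, used only to demonstrate the counting.
const collected = [
  { title: 'Election results announced in major city' },
  { title: 'Markets react to election results' },
  { title: 'City council debates new budget' },
];

console.log(countKeywords(collected).slice(0, 3));
```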
7.4 Content Curation
Use the headlines as sources for your own news aggregation.
Troubleshooting
Common Issues and Solutions
"No headlines found"
- Problem: Selectors might be wrong
- Solution: Ask AI to analyze the HTML again with different approaches
"Getting navigation links instead of headlines"
- Problem: Selector is too broad
- Solution: Ask AI to make selectors more specific to article headlines
"Links are incomplete"
- Problem: BBC uses relative URLs
- Solution: The example code already handles this with URL completion
"Too many results"
- Problem: Script is catching everything
- Solution: Ask AI to add better filtering conditions
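For the "Links are incomplete" case, the startsWith('http') check in the example code covers ordinary relative paths but would mis-handle protocol-relative links such as //example.com/page. A more robust option, sketched here in Node.js, is the standard URL constructor. Whether URL is available inside the Easy Scrape API script environment is not confirmed by this tutorial, and toAbsoluteUrl is a hypothetical helper name.

```javascript
const BASE_URL = 'https://www.bbc.com';

// Resolves a possibly relative href against BASE_URL.
// Returns null when there is no href or it cannot be parsed.
function toAbsoluteUrl(href) {
  if (!href) return null;
  try {
    return new URL(href, BASE_URL).href;
  } catch (error) {
    return null;
  }
}

console.log(toAbsoluteUrl('/news/article-12345'));
console.log(toAbsoluteUrl('https://www.bbc.co.uk/sport'));
console.log(toAbsoluteUrl(undefined));
```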
Next Steps
- Try Different News Sites: Apply the same process to CNN, Reuters, or your local news
- Add More Data: Extract article summaries, publish dates, or categories
- Automate Collection: Set up scheduled runs in your automation platform
- Create Alerts: Build a system to notify you of breaking news
Pro Tips
- 🧠 Be Specific with AI: The more detailed your prompt, the better the code
- 🔍 Test Selectors: Use browser developer tools to verify CSS selectors
- 🚀 Start Simple: Get basic headlines working before adding complexity
- 📊 Check Logs: Use console.log output to debug issues
- 🔄 Iterate: Don't expect perfect results on the first try
Congratulations! You've successfully created your first web scraper without writing any code yourself. The combination of AI assistance and Easy Scrape API makes web scraping accessible to everyone.