Code that gets online news headlines
Here is an example of how you can use Python with the requests and BeautifulSoup libraries to scrape news headlines from a website:
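import requests
from bs4 import BeautifulSoup
# URL of the website you want to scrape
url = "https://www.example.com/news"
# Send a GET request to the URL; the timeout stops the call from hanging indefinitely
response = requests.get(url, timeout=10)
# Raise an exception for 4xx/5xx responses instead of parsing an error page
response.raise_for_status()
# Parse the HTML content of the page using BeautifulSoup
soup = BeautifulSoup(response.content, 'html.parser')
# Find all the headline elements on the page
headlines = soup.find_all('h2', class_='headline')
# Loop through each headline and print the title and link
for headline in headlines:
    title = headline.get_text(strip=True)
    # A headline may not contain a link, so guard against find() returning None
    anchor = headline.find('a')
    link = anchor.get('href') if anchor else None
    print(f"{title} - {link}")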
This code sends a GET request to the specified URL, parses the returned HTML with BeautifulSoup, and uses the find_all method to collect every h2 element that has the class headline. It then loops through each match and prints its title and link.
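
To try the parsing step without hitting a live site, here is a minimal sketch that runs the same find_all logic on an inline HTML string. The sample markup, the extract_headlines helper name, and the base URL are assumptions made for illustration. It also uses urljoin to turn relative links into absolute ones, which the loop above does not do.

```python
from urllib.parse import urljoin
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a downloaded news page
SAMPLE_HTML = """
<html><body>
  <h2 class="headline"><a href="/news/1">First story</a></h2>
  <h2 class="headline"><a href="https://other.example.org/2">Second story</a></h2>
  <h2 class="headline">Story without a link</h2>
  <h2 class="sidebar"><a href="/ads">Not a headline</a></h2>
</body></html>
"""

def extract_headlines(html, base_url):
    """Return (title, absolute_link_or_None) pairs for each h2.headline."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for headline in soup.find_all("h2", class_="headline"):
        title = headline.get_text(strip=True)
        anchor = headline.find("a")
        # Resolve relative hrefs against the page URL; keep None if there is no link
        if anchor is not None and anchor.has_attr("href"):
            link = urljoin(base_url, anchor["href"])
        else:
            link = None
        results.append((title, link))
    return results

if __name__ == "__main__":
    for title, link in extract_headlines(SAMPLE_HTML, "https://www.example.com/news"):
        print(f"{title} - {link}")
```

Keeping the parsing in a function that takes HTML text makes it easy to check against saved pages before pointing it at a real response.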
You can adapt this code to other websites or news sources by changing the url variable and the tag name and class passed to find_all so that they match the target page's markup.
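
One way to make that customization explicit is to pass the tag and class in as parameters instead of editing the script for each site. The sketch below assumes a hypothetical find_headline_titles helper and made-up markup; it is not tied to any real site's layout.

```python
from bs4 import BeautifulSoup

def find_headline_titles(html, tag, css_class):
    """Return the stripped text of every <tag> element that carries css_class."""
    soup = BeautifulSoup(html, "html.parser")
    return [el.get_text(strip=True) for el in soup.find_all(tag, class_=css_class)]

# Made-up pages with different headline markup
PAGE_A = '<h2 class="headline">Alpha</h2><h2 class="other">Skip</h2>'
PAGE_B = '<h3 class="title big">Beta</h3><h3 class="title">Gamma</h3>'

if __name__ == "__main__":
    print(find_headline_titles(PAGE_A, "h2", "headline"))
    print(find_headline_titles(PAGE_B, "h3", "title"))
```

Note that a single class name passed to class_ matches any element that has that class among others, which is why both h3 elements in PAGE_B are returned.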
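Here are a few more examples of values to substitute for specific news sites. Each snippet shows only the lines that change; the request and parsing steps must be re-run after url is updated. The tag and class names reflect each site's markup at the time of writing and may have changed, so inspect the current page source before relying on them: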
Example 2: Scrape headlines from CNN
url = "https://www.cnn.com/world"
headlines = soup.find_all('h3', class_='cd__headline-text')
Example 3: Scrape headlines from BBC News
url = "https://www.bbc.com/news"
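# A multi-class string passed to class_ only matches that exact attribute value; select() matches both classes in any order
headlines = soup.select('h2.gs-c-promo-heading__title.gel-paragon-bold')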
Example 4: Scrape headlines from Google News
url = "https://news.google.com/topstories"
headlines = soup.find_all('h3', class_='ipQ0Sh')
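
When a selector involves more than one class name, it helps to know how BeautifulSoup matches it. The sketch below uses made-up markup and class names (promo, bold) to compare find_all with a space-separated class string against select with a CSS selector.

```python
from bs4 import BeautifulSoup

# Made-up markup: the same two classes in different arrangements
HTML = """
<h2 class="promo bold">Exact order</h2>
<h2 class="bold promo">Reversed order</h2>
<h2 class="promo bold extra">Extra class</h2>
<h2 class="promo">Only one class</h2>
"""

soup = BeautifulSoup(HTML, "html.parser")

# A space-separated string is compared against the whole class attribute value
exact = [h.get_text() for h in soup.find_all("h2", class_="promo bold")]

# A CSS selector requires both classes, regardless of order or extra classes
both = [h.get_text() for h in soup.select("h2.promo.bold")]

if __name__ == "__main__":
    print(exact)
    print(both)
```

The select form is usually the more robust choice when a site reorders or adds classes between page versions.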
Note that web scraping is subject to the terms of service of the website being scraped, and some websites prohibit scraping or require permission for it. Always check a website's terms of service before scraping its content.
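
Alongside the terms of service, many sites publish a robots.txt file that states which paths automated clients may fetch. It does not replace reading the terms, but it can be checked in code. The sketch below uses Python's standard urllib.robotparser on a hypothetical robots.txt; the rules, the is_allowed helper, and the my-bot user agent are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real one lives at https://<site>/robots.txt
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
Allow: /news
""".splitlines()

def is_allowed(robots_lines, user_agent, url):
    """Return True if the given robots.txt lines permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(user_agent, url)

if __name__ == "__main__":
    print(is_allowed(SAMPLE_ROBOTS, "my-bot", "https://www.example.com/news"))
    print(is_allowed(SAMPLE_ROBOTS, "my-bot", "https://www.example.com/private/page"))
```

For a live site, RobotFileParser also offers set_url and read to download the file instead of parsing lines supplied by hand.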
Also, be aware that web scraping can be slow, and that requests only downloads the HTML the server sends; it does not execute JavaScript. Headlines that a page renders in the browser will therefore be missing from the parsed result, and complex page layouts can make selectors fragile. In these cases, you may need more advanced techniques such as a headless browser or a dedicated web scraping tool.
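
The JavaScript limitation can be seen without any network access. In the sketch below, a made-up page injects its headline from a script tag; because the parser only sees the static markup, find_all comes back empty.

```python
from bs4 import BeautifulSoup

# Hypothetical page whose headline is injected by JavaScript at load time
JS_PAGE = """
<html><body>
  <div id="feed"></div>
  <script>
    document.getElementById("feed").innerHTML =
      '<h2 class="headline">Rendered in the browser</h2>';
  </script>
</body></html>
"""

soup = BeautifulSoup(JS_PAGE, "html.parser")

# The script body is treated as plain text, so no h2 element exists in the tree
static_headlines = soup.find_all("h2", class_="headline")

# The container the script would fill is present but empty
feed = soup.find("div", id="feed")

if __name__ == "__main__":
    print(len(static_headlines))
    print(repr(feed.get_text()))
```

An empty result like this on a page that clearly shows headlines in a browser is a common sign that the content is rendered client-side.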