Scraping Bed Bath and Beyond Store Locations

So, what is the easiest way to get a CSV file of all the Bed Bath and Beyond store location data in the USA?

Buy the Bed Bath and Beyond store data from our data store

There are over 700 Bed Bath and Beyond stores in the USA, and you can buy the CSV file containing the address, city, zip, latitude and longitude of each location in our data store for $50.

Figure 1: Bed Bath and Beyond store locations. Source: Bed Bath and Beyond Store Locations dataset

If you are instead interested in scraping the locations on your own, then continue reading the rest of the article.

Scraping Bed Bath and Beyond stores locator webpage

We will keep things simple for now and try to web scrape Bed Bath and Beyond store locations for only one zipcode.

Python is great for web scraping and we will be using a library called Selenium to extract Bed Bath and Beyond store locator’s raw html source for zipcode 30301 (Atlanta, GA area).

  • We will fetch the raw HTML page from the Bed Bath and Beyond store locator page.

  • We will use Selenium to automate typing the search query into the text box and clicking the search button.

### Using Selenium to extract Bed Bath and Beyond store locator's raw html source
import time

from bs4 import BeautifulSoup
import pandas as pd
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

test_url = 'https://www.bedbathandbeyond.com/store/selfservice/FindStore'

option = webdriver.ChromeOptions()
option.add_argument("--incognito")

# Selenium 4 takes the chromedriver path through a Service object
chromedriver = r'chromedriver.exe'
browser = webdriver.Chrome(service=Service(chromedriver), options=option)
browser.get(test_url)

# type the zipcode into the search box and click the search button
text_area = browser.find_element(By.ID, 'searchField')
text_area.send_keys("30301")
element = browser.find_element(By.XPATH, '//*[@id="seachStoresSubmit"]')
element.click()

# give the results a few seconds to render before grabbing the page source
time.sleep(5)
html_source = browser.page_source
browser.quit()

Using BeautifulSoup to extract Bed Bath and Beyond store details

Once we have the raw HTML source, we can use a Python library called BeautifulSoup to parse it.

  • Open the page in the Chrome browser, right-click a store entry and choose Inspect to find the tags and class names that hold the data.

Figure 2: Inspecting the source of Bed Bath and Beyond store locator webpage.

  • We will extract store names.

  • The data will still require some cleaning to separate out store names, address line 1, address line 2, city, state, zipcode and phone numbers, but that's just basic Python string manipulation and we will leave that as an exercise to the reader.

# extracting Bed Bath and Beyond store names
soup = BeautifulSoup(html_source, "html.parser")

# find every <span> tag that carries the store-name class
store_name_list_src = soup.find_all('span', {'class': 'SkuInStoreDetails_5XPw'})
store_name_list = []

for val in store_name_list_src:
    store_name_list.append(val.get_text())
store_name_list[:10]
# Output
['Atlanta',
 'Buckhead Station',
 'Akers Mills',
 'Perimeter Square Shopping Center',
 'Marietta',
 'Gwinnett Market Fair',
 'Kennesaw',
 'Alpharetta',
 'McDonough',
 'Presidential Market Center']

The next step is extracting addresses. Going back to the Chrome inspector, we see that each address sits in an address tag with the class mb2 Address_4JYJ, so we use the BeautifulSoup find_all method to collect them into a list.

# extracting Bed Bath and Beyond addresses

addresses_src = soup.find_all('address', {'class': 'mb2 Address_4JYJ'})
address_list = []

for val in addresses_src:
    address_list.append(val.get_text())
address_list[:10]
# Output

[' Bed Bath & Beyond 1235 Caroline Street Northeast  Atlanta, GA 30307 USA (404) 522-3210',
 ' Bed Bath & Beyond 1 Buckhead Loop  Atlanta, GA 30326 USA (404) 869-0457',
 ' Bed Bath & Beyond 2955 Cobb Parkway, Suite 110  Atlanta, GA 30339 USA (770) 916-9832',
 ' Bed Bath & Beyond 130 Perimeter Center West  Atlanta, GA 30346 USA (770) 673-0171',
 ' Bed Bath & Beyond 4475 Roswell Road  Marietta, GA 30062 USA (770) 971-2405',
 ' Bed Bath & Beyond 3675 Satellite Boulevard  Duluth, GA 30096 USA (770) 495-8255',
 ' Bed Bath & Beyond 840 Ernest W Barrett Parkway Northwest  Kennesaw, GA 30144 USA (770) 499-8863',
 ' Bed Bath & Beyond 6050 North Point Parkway  Alpharetta, GA 30022 USA (770) 475-3036',
 ' Bed Bath & Beyond 1898 Jonesboro Road  McDonough, GA 30253 USA (678) 583-2165',
 ' Bed Bath & Beyond 1905 Scenic Highway  Snellville, GA 30078 USA (770) 982-6263']
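
The cleaning step mentioned earlier can be sketched with a regular expression. This is only a minimal sketch: it assumes every address string follows the layout shown in the output above (brand name, street, two or more spaces, then city, state, zipcode, USA and phone), and the function name parse_address and the output keys are our own choices for illustration.

```python
import re

# Assumed layout, based on the sample output above:
# ' Bed Bath & Beyond <street>  <city>, <ST> <zip> USA <phone>'
ADDRESS_RE = re.compile(
    r"^\s*Bed Bath & Beyond\s+"
    r"(?P<street>.+?)\s{2,}"          # street ends at a run of 2+ spaces
    r"(?P<city>[^,]+),\s*"
    r"(?P<state>[A-Z]{2})\s+"
    r"(?P<zipcode>\d{5})\s+USA\s+"
    r"(?P<phone>\(\d{3}\)\s*\d{3}-\d{4})\s*$"
)


def parse_address(raw):
    """Split one raw address string into fields; return None if it does not match."""
    match = ADDRESS_RE.match(raw)
    if match is None:
        return None
    # Treat anything after the first comma in the street as address line 2.
    line_1, _, line_2 = match.group("street").partition(",")
    return {
        "address_line_1": line_1.strip(),
        "address_line_2": line_2.strip(),
        "city": match.group("city").strip(),
        "state": match.group("state"),
        "zipcode": match.group("zipcode"),
        "phone": match.group("phone"),
    }


sample = " Bed Bath & Beyond 2955 Cobb Parkway, Suite 110  Atlanta, GA 30339 USA (770) 916-9832"
parsed = parse_address(sample)
print(parsed)
```

Strings that do not fit the assumed layout come back as None, so they can be collected and inspected by hand rather than silently mis-parsed.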

  • We will extract store services available at each location.

# extracting store services for each store

local_services_src = soup.find_all('ul', {'class': 'SkuInStoreDetails_34t7 left-align'})
local_services_list = []
for val in local_services_src:
    local_services_list.append(val.get_text())

local_services_list[:10]
# Output
['',
 'Health & Beauty ',
 '',
 'Fine Tabletop & GiftwareHealth & Beauty Specialty Foods',
 '',
 '',
 'Health & Beauty ',
 'Fine Tabletop & GiftwareHealth & Beauty Specialty Foods',
 '',
 '',
 '']
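
In the output above, get_text() runs the individual services together ('Fine Tabletop & GiftwareHealth & Beauty Specialty Foods'). One possible way to keep them apart is to read each text fragment separately with BeautifulSoup's stripped_strings. The sample_html below is a made-up stand-in for the real page, and the li children are an assumption about the markup, so treat this as a sketch rather than a drop-in fix.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: we assume each service is its own child element of the <ul>.
sample_html = """
<ul class="SkuInStoreDetails_34t7 left-align">
  <li>Fine Tabletop &amp; Giftware</li><li>Health &amp; Beauty </li><li>Specialty Foods</li>
</ul>
<ul class="SkuInStoreDetails_34t7 left-align"></ul>
"""

sample_soup = BeautifulSoup(sample_html, "html.parser")

services_per_store = []
for ul in sample_soup.find_all("ul", class_="SkuInStoreDetails_34t7"):
    # stripped_strings yields each non-empty text fragment with whitespace trimmed
    services_per_store.append(list(ul.stripped_strings))

print(services_per_store)
```

A store with no listed services ends up with an empty list instead of an empty string, which is easier to handle when building a table later.
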
  • Lastly, let's extract store URLs for each location.

# extracting store urls
store_url_list = []
store_url_list_src = soup.find_all('div', {'class': 'mt2'})
for val in store_url_list_src:
    try:
        store_url_list.append(val.find('a')['href'])
    except (TypeError, KeyError):
        # this div has no link, or the link has no href attribute
        pass
store_url_list[:10]
# Output
['https://stores.bedbathandbeyond.com/Atlanta-GA-30307-1014',
 '/store/pickup/store-1014',
 'https://stores.bedbathandbeyond.com/Atlanta-GA-30326-118',
 '/store/pickup/store-118',
 'https://stores.bedbathandbeyond.com/Atlanta-GA-30339-1094',
 '/store/pickup/store-1094',
 'https://stores.bedbathandbeyond.com/Atlanta-GA-30346-57',
 '/store/pickup/store-57',
 'https://stores.bedbathandbeyond.com/Marietta-GA-30062-280',
 '/store/pickup/store-280']
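
Note that the list above alternates between a store page URL and a relative pickup link for the same store. A short sketch of how the two could be separated is given below; it assumes the store page URLs always start with the stores.bedbathandbeyond.com prefix and end in the numeric store ID, as they do in this sample, and the helper name store_id_from_url is ours.

```python
import re

# First four values copied from the output above
store_url_list = [
    'https://stores.bedbathandbeyond.com/Atlanta-GA-30307-1014',
    '/store/pickup/store-1014',
    'https://stores.bedbathandbeyond.com/Atlanta-GA-30326-118',
    '/store/pickup/store-118',
]

STORE_PAGE_PREFIX = 'https://stores.bedbathandbeyond.com/'

# keep only the full store page URLs and drop the relative pickup links
store_page_urls = [url for url in store_url_list if url.startswith(STORE_PAGE_PREFIX)]


def store_id_from_url(url):
    """Return the trailing numeric ID of a store page URL, or None if absent."""
    match = re.search(r'-(\d+)$', url)
    return match.group(1) if match else None


store_ids = [store_id_from_url(url) for url in store_page_urls]
print(store_page_urls)
print(store_ids)
```

After filtering there is one URL per store, so the list lines up with the store names and addresses extracted earlier.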

Converting into CSV file

You can take the lists above and load them into a pandas DataFrame, as long as they all have the same length. Once you have the DataFrame, you can export it to CSV, Excel or JSON.
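
As a minimal sketch of that step, the snippet below builds a DataFrame from the first three values of each list shown earlier and converts it to CSV text. It assumes the lists are aligned by position (the first name belongs to the first address and so on) and that the URL list has already been reduced to one store page URL per store. The column names are our own.

```python
import pandas as pd

# First three values copied from the outputs above; in practice use the full lists.
store_name_list = ['Atlanta', 'Buckhead Station', 'Akers Mills']
address_list = [
    ' Bed Bath & Beyond 1235 Caroline Street Northeast  Atlanta, GA 30307 USA (404) 522-3210',
    ' Bed Bath & Beyond 1 Buckhead Loop  Atlanta, GA 30326 USA (404) 869-0457',
    ' Bed Bath & Beyond 2955 Cobb Parkway, Suite 110  Atlanta, GA 30339 USA (770) 916-9832',
]
store_url_list = [
    'https://stores.bedbathandbeyond.com/Atlanta-GA-30307-1014',
    'https://stores.bedbathandbeyond.com/Atlanta-GA-30326-118',
    'https://stores.bedbathandbeyond.com/Atlanta-GA-30339-1094',
]

# every column must have the same number of rows
df = pd.DataFrame({
    'store_name': store_name_list,
    'address': [address.strip() for address in address_list],
    'store_url': store_url_list,
})

# with no path argument, to_csv returns the CSV content as a string
csv_text = df.to_csv(index=False)
print(csv_text)
```

To write a file instead, pass a path such as df.to_csv('stores.csv', index=False). df.to_json and df.to_excel work along the same lines, although to_excel needs an extra Excel writer package such as openpyxl.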

Scaling up to a full crawler for extracting all Bed Bath and Beyond store locations in USA

  • Once you have the above scraper that can extract data for one zipcode/city, you will have to iterate through all the US zip codes.

  • It depends on how much coverage you want, but for a national chain like Bed Bath and Beyond you are looking at running the above code tens of thousands of times (there are roughly 41,000 US zip codes in use) to ensure that no region is left out.

  • Once you scale up to thousands of requests, Bed Bath and Beyond's servers will either block your IP address outright or flag you and start serving CAPTCHAs.

  • To improve your chances of fetching data for the whole USA, you will have to implement:

    • rotating proxy IP addresses, preferably residential proxies
    • rotating user agents
    • an external CAPTCHA solving service like 2captcha or anticaptcha.com
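
The crawl loop described above can be sketched as follows. This is an outline under stated assumptions, not a production crawler: scrape_zipcode is a hypothetical function you would write yourself by wrapping the Selenium and BeautifulSoup steps from this article, the user agent strings are placeholders, and proxy rotation and CAPTCHA solving are left out.

```python
import itertools
import random
import time

# Placeholder user agent strings; replace with real, current ones.
USER_AGENTS = [
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) ExampleBrowser/1.0',
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) ExampleBrowser/2.0',
]


def crawl_zipcodes(zipcodes, scrape_zipcode, delay_range=(0.0, 0.0)):
    """Run scrape_zipcode over every zipcode and de-duplicate stores by URL.

    scrape_zipcode is a hypothetical callable (zipcode, user_agent) that
    returns a list of dicts, each with at least a 'store_url' key.
    """
    stores_by_url = {}
    failed = []
    user_agents = itertools.cycle(USER_AGENTS)  # rotate through the pool
    for zipcode in zipcodes:
        try:
            stores = scrape_zipcode(zipcode, next(user_agents))
        except Exception:
            # remember the zipcode so it can be retried later
            failed.append(zipcode)
            continue
        for store in stores:
            # neighbouring zipcodes return the same stores; keep the first copy
            stores_by_url.setdefault(store['store_url'], store)
        # random pause between requests to avoid hammering the server
        time.sleep(random.uniform(*delay_range))
    return list(stores_by_url.values()), failed


# Usage with a fake scraper standing in for the real one
def fake_scrape(zipcode, user_agent):
    if zipcode == '00000':
        raise RuntimeError('blocked')
    return [{'store_url': 'https://stores.bedbathandbeyond.com/Atlanta-GA-30307-1014'}]


stores, failed = crawl_zipcodes(['30301', '30302', '00000'], fake_scrape)
print(len(stores), failed)
```

De-duplicating by store URL matters because nearby zipcodes return overlapping results, and keeping a list of failed zipcodes lets you retry them later, for example through a different proxy.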

After you follow all the steps above, you will realize that our pricing ($50) for web scraped store location data covering all Bed Bath and Beyond stores is one of the most competitive in the market.

Another advantage of buying our data is that it includes geocoding (latitude and longitude) with rooftop accuracy, which makes it possible to plot locations on a map like Figure 1 or perform more advanced spatial analysis.