Skip to content

Instantly share code, notes, and snippets.

@1vth1nk3r
Last active June 10, 2023 10:30
Show Gist options
  • Select an option

  • Save 1vth1nk3r/d9b349d1b05df31488ff0301e81ef7aa to your computer and use it in GitHub Desktop.

Select an option

Save 1vth1nk3r/d9b349d1b05df31488ff0301e81ef7aa to your computer and use it in GitHub Desktop.
Pillow Palooza project sql questons
1. What is the most common room type in NYC Airbnb listings?
'entire home/apt' — 13266, but 'private room' pretty close
SELECT room_type, count(*) FROM "room_types"
GROUP BY room_type;
2. What is the average price of a listing by room type?
shared - 53.65, entire – 197.17, private – 81.67
SELECT r.room_type, round(AVG(p.price), 2)
FROM prices as p
LEFT JOIN room_types as r ON p.listing_id = r.listing_id
GROUP BY room_type;
3. Which borough has the highest average price per month?
Manhattan
SELECT p.borough, round(AVG(p.price_per_month)::numeric, 2) as avg_price
FROM prices as p
GROUP BY p.borough
ORDER BY avg_price DESC
4. How many listings of each room type are in each borough?
SELECT p.borough, r.room_type, count(*) AS listings
FROM prices as p
LEFT JOIN room_types as r ON p.listing_id = r.listing_id
GROUP BY p.borough, r.room_type
ORDER BY p.borough, r.room_type;
5. How many listings in each room type category have a price of over $500 per night?
entire – 395, private – 19, shared - 1
SELECT r.room_type, count(*) AS listings
FROM prices as p
LEFT JOIN room_types as r ON p.listing_id = r.listing_id
WHERE p.price > 500
GROUP BY r.room_type
ORDER BY r.room_type;
6. What is the distribution of listing prices by neighborhood?
SELECT p.borough,
round(MIN(p.price), 2) as min_price,
round(AVG(p.price), 2) as avg_price,
round(MAX(p.price), 2) as max_price
FROM prices as p
GROUP BY p.borough
7. What is the estimated amount of revenue generated by hosts in each borough?
SELECT p.borough,
SUM(p.price*r.booked_days_365) as revenue
FROM prices as p
LEFT JOIN reviews as r ON p.listing_id = r.listing_id
GROUP BY p.borough
ORDER BY p.borough;
8. What is the average price per month for listings in each neighborhood?
SELECT p.borough, p.neighbourhood,
round(AVG(p.price_per_month)::numeric, 2) as avg_price
FROM prices as p
GROUP BY p.borough, p.neighbourhood
ORDER BY p.borough, p.neighbourhood
SELECT p.neighbourhood, r.room_type,
round(AVG(p.price_per_month)::numeric, 2) as avg_price
FROM prices as p
LEFT JOIN room_types AS r ON p.listing_id = r.listing_id
GROUP BY p.neighbourhood, r.room_type
ORDER BY r.room_type, p.neighbourhood
9. How many listings have no reviews?
0
SELECT count(*)
FROM prices as p
LEFT JOIN reviews as r ON p.listing_id = r.listing_id
WHERE number_of_reviews IS NULL
10. How do the estimated book days correlate with the price of an Airbnb listing in New York City?
-0.080
SELECT round(corr(p.price, r.booked_days_365)::numeric,3)
FROM prices as p
LEFT JOIN reviews as r ON p.listing_id = r.listing_id
11. What is the average price per room type for listings that have at least 100 reviews
and are available more than 200 days a year?
SELECT rt.room_type, round(AVG(p.price)::numeric,3)
FROM prices as p
LEFT JOIN reviews as r ON p.listing_id = r.listing_id
LEFT JOIN room_types as rt ON p.listing_id = rt.listing_id
WHERE r.number_of_reviews >=100
AND r.availability_365 > 200
GROUP BY rt.room_type
ORDER BY rt.room_type
12. How many hosts have more than one listing, and what's the maximum number of listings by a single host name?
2930 / 215
SELECT host_name, num_listings
FROM
(SELECT host_name, count(listing_id) as num_listings
FROM reviews as r
GROUP BY host_name) AS temp
WHERE num_listings>1
ORDER BY num_listings DESC
13. Determine the top 5 hosts who have the highest price_per_month for their listings,
considering only hosts who have at least 10 listings.
SELECT host_name, max_price, num_listings
FROM
(SELECT r.host_name, MAX(p.price) as max_price,
count(r.listing_id) as num_listings
FROM reviews as r
LEFT JOIN prices as p ON r.listing_id = p.listing_id
GROUP BY r.host_name) AS hosts
WHERE num_listings > 10
ORDER BY max_price DESC
LIMIT 5
14. Find the neighborhood(s) that have the highest variance in listing prices.
SELECT borough, neighbourhood, var_price
FROM (
SELECT p.borough, p.neighbourhood,
round(VARIANCE(p.price), 2) as var_price
FROM prices as p
GROUP BY p.borough, p.neighbourhood ) as boroughs
WHERE var_price IS NOT NULL
ORDER BY var_price DESC
15. Calculate the average price_per_month for each neighborhood, taking into account only listings
where the host has a minimum_nights value that is higher than the average minimum_nights value across all listings.
SELECT p.borough, p.neighbourhood,
round(AVG(p.price_per_month)::numeric, 2) as avg_price
FROM
(SELECT *
FROM prices as p
LEFT JOIN reviews as r ON p.listing_id = r.listing_id
WHERE r.minimum_nights > (SELECT AVG(minimum_nights) FROM reviews)) as p
GROUP BY p.borough, p.neighbourhood
ORDER BY p.borough, p.neighbourhood
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment