SQL Tips

Conditionals can be used inside SUM() with group by aggregations

having sum(name="XYZ") > 0

sum(case when x="something" then 1 else 0 end) as col_name

Use inverse

Instead of finding something using a complex query, use a subquery that does the opposite of the primary problem. Then use the outer query to find "not in subquery".

Reduce duplicate rows cross-join

WITH cte AS (
  SELECT BASKET_ID, SUB_COMMODITY_DESC
  FROM transaction_df t
  JOIN products_df p
    ON t.PRODUCT_ID = p.PRODUCT_ID
)

SELECT
  lt.SUB_COMMODITY_DESC AS p1,
  rt.SUB_COMMODITY_DESC AS p2,
  COUNT(*) AS cnt
FROM cte lt
JOIN cte rt
  ON lt.BASKET_ID = rt.BASKET_ID
  AND lt.SUB_COMMODITY_DESC < rt.SUB_COMMODITY_DESC -- This line
  AND lt.SUB_COMMODITY_DESC != rt.SUB_COMMODITY_DESC
GROUP BY p1, p2
ORDER BY cnt DESC

Interview Questions

Postgress SQL date diff

SELECT *
FROM tablexyz
WHERE
  DATEDIFF('2024-01-31', last_active_date) > 30;

-- using cast
SELECT *
FROM tablexyz
WHERE
  DATEDIFF(CAST('2024-01-31' AS date), CAST(last_active_date AS date)) > 30;

**Use row number/rank in ascending and decending order to find first and last row

with cte as (
    select 
    city
    , country
    , round(population/area) density
    , rank() over( order by population/area) as rn
    , rank() over( order by (population/area) desc) as rn2
    from cities_population where area <> 0
)
select city, country, density from cte where rn=1 or rn2=1

Last updated