Snips & Tips
Snips & Tips
  • Snips & Tips
  • 📊Data Science
    • Polars Dataframe Library
    • Loading large data
    • Pandas
      • Pandas Apply Function
    • Apache Spark
      • Custom Transformer
    • Data Visualizations
    • Jupyter Notebooks
      • Jupyter Notebook Structure
    • Probability
    • Statistics
      • Statistical Tests
      • Z - Test
      • Hypothesis Testing
    • SQL
      • SQL Tips
      • Creating new columns
  • ☘️Deep Learning
    • Backpropagation in Deep Learning
    • Pytorch Early Stopping
    • Optimizers
  • Pytorch Tensor Shapes
  • 🔖Machine Learning
    • Handling Imbalanced Dataset
    • Time Series Forecasting
      • Hierarchical Time Series Forecasting
      • Facebook Prophet
      • Misc
    • Handling high dimensionality data
      • Weight of evidence and Information value
    • Debugging ML Models
    • Feature Engineering
      • Time Series
      • Outlier Detection
      • Categorical Encoding
      • Feature Scaling
  • 🐲DSA
    • Arrays
  • 🖥️WEB DEV
    • Typescript
    • React State Management
    • Redux Boilerplate
    • Intercept a HTTP request or response
    • this keyword
    • Array Methods
    • Throttle Debounce
    • Media Queries
    • React Typeahead Search
  • Replace text with React Component
  • 💻Product Analytics
    • Product Sense
    • Customer Segmentation
  • 🖥️Terminal
    • Terminal Commands
    • Jupyter Notebook 2 HTML
  • 🪛Tools and Libraries
    • Web Based
    • Databases
  • 🚟Backend
    • Fast API CRUD
    • Scalable APIs
  • 💸Quant Finance
    • Misc
    • Factor Investing
  • 🎮Game Dev
    • Misc
  • 🛠️Architecture
    • Docker
    • AWS CDK
  • 🦠Artificial Intelligence
    • AI Engg
Powered by GitBook
On this page
  • Use inverse
  • Reduce duplicate rows cross-join
  • Interview Questions
  • Postgress SQL date diff

Was this helpful?

Edit on GitHub
  1. Data Science
  2. SQL

SQL Tips

Conditionals can be used inside SUM() with group by aggregations

having sum(name="XYZ") > 0

sum(case when x="something" then 1 else 0 end) as col_name

Use inverse

Instead of finding something using a complex query, use a subquery that does the opposite of the primary problem. Then use the outer query to find "not in subquery".

Reduce duplicate rows cross-join

WITH cte AS (
  SELECT BASKET_ID, SUB_COMMODITY_DESC
  FROM transaction_df t
  JOIN products_df p
    ON t.PRODUCT_ID = p.PRODUCT_ID
)

SELECT
  lt.SUB_COMMODITY_DESC AS p1,
  rt.SUB_COMMODITY_DESC AS p2,
  COUNT(*) AS cnt
FROM cte lt
JOIN cte rt
  ON lt.BASKET_ID = rt.BASKET_ID
  AND lt.SUB_COMMODITY_DESC < rt.SUB_COMMODITY_DESC -- This line
  AND lt.SUB_COMMODITY_DESC != rt.SUB_COMMODITY_DESC
GROUP BY p1, p2
ORDER BY cnt DESC

Interview Questions

Postgress SQL date diff

SELECT *
FROM tablexyz
WHERE
  DATEDIFF('2024-01-31', last_active_date) > 30;

-- using cast
SELECT *
FROM tablexyz
WHERE
  DATEDIFF(CAST('2024-01-31' AS date), CAST(last_active_date AS date)) > 30;

**Use row number/rank in ascending and decending order to find first and last row

with cte as (
    select 
    city
    , country
    , round(population/area) density
    , rank() over( order by population/area) as rn
    , rank() over( order by (population/area) desc) as rn2
    from cities_population where area <> 0
)
select city, country, density from cte where rn=1 or rn2=1
PreviousSQLNextCreating new columns

Last updated 7 months ago

Was this helpful?

📊
The Best Medium-Hard Data Analyst SQL Interview Questionsquip
How to Subtract Days From a Date in PostgreSQL
How to Find Difference Between Two Dates in PostgreSQL
Logo