Poorly written SQL queries can result in slow performance, high resource costs, locking and blocking issues, and unhappy users. The following are common practices that can be used to write efficient queries.
1. Use Indexes WiselyIndexes let the database quickly look up rows instead of scanning the entire table. For Example:
Creating an index on customer_id if there are frequent queries on this column like the following query.
SELECT * FROM orders WHERE customer_id = 123;
CREATE INDEX idx_orders_customer_id ON orders(customer_id);
The above query will run much faster if customer_id is indexed.
Best Practices:
Using SELECT * impact query performance with large tables or joins. The database engine retrieves every column, even the ones you don’t need which increases memory usage, slows down data transfer, and makes the execution plan more complex. Example:
Avoid this:Use this instead:SELECT * FROM products;
Why This Helps:SELECT product_id, product_name, price FROM products;
Fetching more rows than needed is a common issue. Even if you only use 10 rows in your app, the query might retrieve thousands, slowing things down. Use the WHERE clause to filter data precisely and LIMIT to restrict the number of rows returned.
Example:
Benefits:SELECT name FROM customers
WHERE country = 'USA'
ORDER BY signup_date DESC
LIMIT 50;
The WHERE clause is one of the most important parts of an SQL query because it filters rows based on conditions. However, how you write it can significantly impact performance. A common mistake is using functions or operations directly on column values in the WHERE clause — this can prevent the database from using indexes, which slows down query execution.
Poor:SELECT * FROM employees WHERE YEAR(joining_date) = 2022;
Why this is bad: The YEAR() function is applied to every value in the joining_date column. This disables the use of indexes, forcing a full table scan.
Optimized:Optimization Tips:SELECT * FROM employees
WHERE joining_date >= '2022-01-01' AND joining_date < '2023-01-01';
Using SQL functions (like UPPER(), LOWER(), DATE()) on indexed columns can prevent the database from using indexes, leading to slower queries.
Bad (pseudocode):Good (pseudocode):SELECT * FROM users WHERE UPPER(email) = 'JOHN@GMAIL.COM';
Why This Matters:SELECT * FROM users WHERE email = 'john@gmail.com';
Always join only the tables you need and filter data before joining whenever possible. Use INNER JOIN instead of OUTER JOIN when you don't need unmatched records.
Example:
Why This Matters:SELECT u.name, o.amount
FROM users u
JOIN orders o ON u.user_id = o.user_id
WHERE o.amount > 100;
N+1 happens when your app makes one query to get a list, then runs additional queries in a loop to get related data. Always aim to fetch related data in one query using JOINs.
Bad (pseudocode):Good pseudocode:SELECT * FROM users;
-- For each user: SELECT * FROM orders WHERE user_id = ?
Why This Matters:SELECT u.user_id, u.name, o.order_id, o.amount
FROM users u
JOIN orders o ON u.user_id = o.user_id;
When checking existence, EXISTS can be more efficient than IN, especially if the subquery returns a large dataset.
Bad (pseudocode):Good (pseudocode):SELECT name FROM customers
WHERE customer_id IN (SELECT customer_id FROM orders);
Why This Matters:SELECT name FROM customers
WHERE EXISTS (
SELECT 1 FROM orders WHERE orders.customer_id = customers.customer_id
);
Using % at the beginning of a LIKE pattern disables index use and leads to full table scans.
Bad (pseudocode):Good (pseudocode):SELECT * FROM users WHERE name LIKE '%john';
Why This Matters:SELECT * FROM users WHERE name LIKE 'john%';
While normalization keeps data clean, excessive joins can slow down read-heavy queries. Denormalization (storing some redundant data) can help in scenarios where performance is more critical than strict data structure.
Example:
Every major DBMS has a way to show the execution plan (like EXPLAIN in MySQL/PostgreSQL). It shows how the SQL engine processes your query.
Example:
Why This Matters:EXPLAIN SELECT * FROM orders WHERE user_id = 42;
UNION removes duplicates, which adds sorting overhead. If you don’t need duplicates removed, UNION ALL is faster.
Bad (pseudocode):Good (pseudocode):SELECT col FROM table1
UNION
SELECT col FROM table2;
Why This Matters:SELECT col FROM table1
UNION ALL
SELECT col FROM table2;
Don't run a query inside a loop in your app if you can write one efficient query that retrieves all needed data.
Bad (pseudocode):Good (pseudocode):for id in ids:
cursor.execute("SELECT name FROM users WHERE id = ?", (id,))
Why This Matters:SELECT id, name FROM users WHERE id IN (1, 2, 3, 4);
Partitioning helps by breaking a large table into smaller, more manageable chunks. Queries on partitions are faster as they scan only relevant data.
Example:
Sorting and grouping can be expensive operations. Always limit the number of rows being sorted or grouped and use indexes that match the ORDER BY columns if possible.
Bad (pseudocode):Good (pseudocode):SELECT * FROM orders ORDER BY created_at;
Why This Matters:SELECT order_id, amount FROM orders WHERE created_at >= '2023-01-01' ORDER BY created_at;
RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4