System Design
Database Indexes
Index — ma’lumotlarni tez topish uchun mo’ljallangan ma’lumotlar strukturasi.
Muammo: Slow queries
1 million users, name’ga search:
SELECT * FROM users WHERE name = 'Jamshid';
Index yo’q: Full table scan — 1M qatorni tekshirish
Vaqt: 1-5 sekund
Index bilan: B-tree’da qidirish
Vaqt: 1-10 millisekund
500x tezroq!
Index nima?
Kitobning oxirida index (so’zlar ro’yxati):
Scaling → 15, 42, 87-bet
Sharding → 93-bet
ACID → 12-bet
Har bir so’zni izlash o’rniga, index’dan sahifa raqamini topasz.
Database index xuddi shunday:
Index:
"Jamshid" → Row 42, 156, 892
"Ali" → Row 13, 89
B-Tree Index (Eng ko’p qo’llaniladigan)
[M]
/ \
[D,H] [Q,V]
/ | \ / | \
[A] [E] [J] [N] [R] [W]
↓ ↓ ↓ ↓ ↓ ↓
Data Data Data Data Data Data
Complexity: O(log N)
1M rows: ~20 disk reads
1B rows: ~30 disk reads
Misol
CREATE INDEX idx_users_name ON users(name);
SELECT * FROM users WHERE name = 'Jamshid';
Qidiruv:
- Root node: “Jamshid” > “M”? Yo’q, chapga
- [D,H]: “Jamshid” > “H”? Ha, o’ngga
- [J]: Topildi! → Row 42, 156, 892
20-30 disk read o’rniga 3-4 disk read
Index turlari
1. Primary Key Index (Automatic)
CREATE TABLE users (
id SERIAL PRIMARY KEY, -- Avtomatik index yaratiladi
name VARCHAR(100)
);
Unique va sorted — juda tez.
2. Unique Index
CREATE UNIQUE INDEX idx_users_email ON users(email);
Duplicate qiymatlar yo’q bo’lishini ta’minlaydi.
3. Composite Index (Multi-column)
CREATE INDEX idx_users_last_first ON users(last_name, first_name);
Ishlaydi:
WHERE last_name = 'Smith'
WHERE last_name = 'Smith' AND first_name = 'John'
Ishlamaydi:
WHERE first_name = 'John' (birinchi column emas)
Qoida: Left-most prefix.
4. Partial Index
CREATE INDEX idx_active_users ON users(name) WHERE status = 'active';
Faqat active userlar index’langan → kam joy, tezroq.
5. Full-Text Index
CREATE INDEX idx_posts_content ON posts USING GIN(to_tsvector('english', content));
SELECT * FROM posts WHERE to_tsvector('english', content) @@ to_tsquery('scaling');
Qachon: Search functionality.
Index qachon ishlatiladi?
Index kerak:
1. WHERE clause:
SELECT * FROM users WHERE email = 'user@example.com';
-- Index on email
2. JOIN:
SELECT * FROM orders o JOIN users u ON o.user_id = u.id;
-- Index on orders.user_id
3. ORDER BY:
SELECT * FROM posts ORDER BY created_at DESC LIMIT 10;
-- Index on created_at
4. GROUP BY:
SELECT category, COUNT(*) FROM products GROUP BY category;
-- Index on category
Index kerak emas:
1. Kichik tablelar (<1000 rows)
Full scan tezroq.
2. High cardinality yo’q:
-- gender: male/female (2 qiymat)
CREATE INDEX idx_users_gender ON users(gender);
Foydasiz — 50% rows male, 50% female.
3. Rarely used queries:
Index storage va write overhead.
4. Juda ko’p yoziladigan tablelar:
Har bir INSERT/UPDATE index’ni yangilaydi.
Index overhead
Storage
-- Table: 1M rows, 1GB
-- Index: ~200MB extra per index
Har bir index storage talab qiladi.
Write performance
INSERT INTO users VALUES (...);
Index yo’q: 1ms
3 ta index: 4ms (har bir index’ni yangilash kerak)
Trade-off: Read speed uchun write speed qurbon qilish.
EXPLAIN — Query analyze
EXPLAIN ANALYZE
SELECT * FROM users WHERE name = 'Jamshid';
Index yo’q:
Seq Scan on users (cost=0.00..18334.00 rows=100000)
Filter: (name = 'Jamshid')
Execution Time: 2500.123 ms
Index bilan:
Index Scan using idx_users_name on users (cost=0.42..8.45 rows=3)
Index Cond: (name = 'Jamshid')
Execution Time: 1.234 ms
2000x tezroq!
Index best practices
1. Index WHERE columns
-- Query
SELECT * FROM orders WHERE user_id = 123 AND status = 'completed';
-- Index
CREATE INDEX idx_orders_user_status ON orders(user_id, status);
2. Cover index (Include columns)
-- Query (SELECT specific columns)
SELECT user_id, total, created_at FROM orders WHERE status = 'completed';
-- Covering index
CREATE INDEX idx_orders_status_cover ON orders(status) INCLUDE (user_id, total, created_at);
Index o’zida barcha kerakli columnlar → table access kerak emas → juda tez
3. Analyze va monitor
-- PostgreSQL
ANALYZE users;
-- Query stats
SELECT * FROM pg_stat_user_indexes;
Qaysi indexlar ishlatilayotgani, qaysilari yo’qligi.
4. Drop unused indexes
-- Unused index
DROP INDEX idx_users_phone;
Har 6 oyda unused indexlarni o’chirish.
5. Partial index for specific queries
-- Faqat premium userlar ko'p query qilinadi
CREATE INDEX idx_premium_users ON users(id) WHERE plan = 'premium';
Kam joy, tezroq.
Real-world misollar
-- Photos by user
CREATE INDEX idx_photos_user_created ON photos(user_id, created_at DESC);
-- Query
SELECT * FROM photos WHERE user_id = 123 ORDER BY created_at DESC LIMIT 20;
-- Index'dan to'g'ridan-to'g'ri
-- Timeline
CREATE INDEX idx_tweets_user_time ON tweets(user_id, created_at DESC);
-- Trending hashtags
CREATE INDEX idx_tweets_hashtag_time ON tweets USING GIN(hashtags);
E-commerce
-- Product search
CREATE INDEX idx_products_category_price ON products(category, price);
-- Full-text search
CREATE INDEX idx_products_search ON products USING GIN(to_tsvector('english', name || ' ' || description));
Sharding va indexes
Sharded database’da index har bir shard’da:
Shard 1: users 0-1M
- Index on email
- Index on name
Shard 2: users 1M-2M
- Index on email
- Index on name
Global index yo’q — har bir shard’da alohida.
Xulosa
Index:
- Read tezligini oshiradi (100-1000x)
- Storage va write overhead
- B-tree eng ko’p qo’llaniladi
Qachon yaratish:
- WHERE, JOIN, ORDER BY columnlar
- High cardinality (ko’p unique qiymatlar)
- Ko’p query qilinadigan
Best practices:
- EXPLAIN bilan analyze qiling
- Composite index (multi-column)
- Covering index (INCLUDE)
- Unused indexlarni o’chiring
Keyingi dars: Database Replication - high availability.