General data engineering
- Google Dataproc
- Google Data Fusion
- Data pipeline design framework
- Statistics in data analysis
- Partitions on Apache Hive
- Overview of BI tools
- Order by vs. sort by vs. distribute by vs. cluster by
- MapReduce
- MapReduce components
- Managed table vs external table
- Introduction to Apache Pig
- Introduction to Apache Hive
- Hive window and analytic functions
- Data vault modelling
- DBT - the good solution to accelerate data transformation
- Buckets on Apache Hive
- Behind a Hive table
- Bloom filter
- CAP theorem
- Creating a fully local search engine on Memo
- Data analyst in retail trading
- Database design circular
- Database locking
- DuckDB demo and showcase
- Evolutionary database design
- Full-text search with PostgreSQL
- Hadoop distributed file system (HDFS)
- How Discord stores messages - part 1: from MongoDB to Cassandra
- How I came up with our security standard
- Introduction to CRDT
- Local-first software
- Multi-column index in DB
- Quick learning vector database
- Redis leaderboard
- Self-balanced BSTs - AVL trees
- SQL and how it relates to disk reads and writes
- SQL practices ORM vs plain SQL
- SQL sargable queries and their impact on database performance
- Utilizing cached table for Binance Kline API data processing