Paper Review: MonetDB/X100 - Hyper-Pipelining Query Execution

Link: https://15721.courses.cs.cmu.edu/spring2024/papers/04-execution1/boncz-cidr2005.pdf Notes: This is a 2005 paper from the CMU Advanced Database Systems syllabus. The authors are from CWI (the same research lab behind DuckDB). The paper demonstrates how the DBMSs of the time were poorly suited to modern super-scalar CPUs, and gives an overview of the design and implementation of a new execution engine for MonetDB that has better mechanical sympathy. What are super-scalar CPUs? They are the processors we’re most familiar with today (Intel, ARM). They are designed with multiple pipelines, which are able to reorder code execution. Having more pipelines can be better than having a faster clock speed in many cases. Data dependencies (e.
Read more →

Building With Rust

I’ve been developing software professionally for over 10 years now. In that time, I’ve mostly focused on the backend (with the occasional dalliance with WebGL, JS, TypeScript and that entire mess of an ecosystem). My go-to languages in that time have been Java and Golang. Golang is simple, easy to get started with, and incredibly convenient to deploy. I’ve built many projects with Golang that were never meant to grow huge, but to do one thing really well.
Read more →

What’s up, Meltdown?

Everyone seems to have been going crazy about Meltdown/Spectre for the past week. I just finished reading the Meltdown paper, and doing so made the existential threat clearer, though I’m still hesitant to believe that the sky is falling. Very credible folks from the security community (tptacek et al.) seem to think that this is v bad, so I’m inclined to believe that I’m not seeing the full picture here. I first read the Spectre PoC code (published in Appendix A of that paper and available in GitHub gist form here).
Read more →

Notes on Google PowerDrill

Links: Hall et al.: Processing a Trillion Cells per Mouse Click; Hall lecture video; Wired article (typical Wired garbage, but still contains a few details not found in the paper). Notes: Formatted in a question / answer style. Introduction & Background: What is PowerDrill (PD)? A web-based analysis tool built by the Google AdWords team. The columnar storage backend and execution engine is called “PD Serving”, and is the focus of this paper. What types of analysis can you do in PD?
Read more →

An Interesting Retrospective on The Fall of Sun

Java has been one of the primary languages I’ve used as an engineer since about 2009 or so, which happened to coincide with the sale of Sun to Oracle. I wasn’t very aware of the history of the industry until I started poking around a few years ago, and seeing as Sun more-or-less predated me by a few years, I wanted to understand what they were like in their heyday, and what made them die.
Read more →

A Collection of JDK Hacks That Are Broken In Jigsaw

The Hack: Injecting new classes into packages you don’t actually own. Why We Do It: Accessing package-private methods and classes. Post-Jigsaw Solution: Fix the damn issue in the upstream code. If it’s a very slow-moving project, then fork it and publish the artifacts to Jitpack or your favorite corporate Maven repo. If it’s a fast-moving project that has monthly releases, then just send a patch to the owner and see if they accept it.
Read more →

Make Your Own Tools

Give me a lever long enough and a fulcrum sturdy enough and I will move the world. — Archimedes on mechanical advantage. Tools give you leverage: the upfront investment saves you time later on. Whenever you have a large project, make sure you either find OR MAKE the right tools for the job. Even this site is developed using a hand-rolled Makefile. I used to have to run some tedious commands to watch, deploy, and rebuild parts of my site.
Read more →