Created
February 21, 2026 22:53
Visual for: Parquet columnar storage vs CSV row storage
| <!DOCTYPE html> | |
| <html lang="en"> | |
| <head> | |
| <meta charset="UTF-8"> | |
| <meta name="viewport" content="width=device-width, initial-scale=1.0"> | |
| <title>Parquet Columnar Storage: Row vs Column Layout</title> | |
| <style> | |
| body { | |
| margin: 0; | |
| padding: 20px; | |
| background: #1a1a2e; | |
| display: flex; | |
| justify-content: center; | |
| align-items: center; | |
| min-height: 100vh; | |
| font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', sans-serif; | |
| } | |
| svg { | |
| max-width: 700px; | |
| width: 100%; | |
| height: auto; | |
| } | |
| </style> | |
| </head> | |
| <body> | |
| <svg viewBox="0 0 700 420" xmlns="http://www.w3.org/2000/svg"> | |
| <!-- Background --> | |
| <rect width="700" height="420" rx="12" fill="#1a1a2e"/> | |
| <!-- Title --> | |
| <text x="350" y="32" text-anchor="middle" fill="#e0e0e0" font-size="18" font-weight="bold" font-family="-apple-system, BlinkMacSystemFont, 'Segoe UI', sans-serif">Row Storage (CSV) vs Column Storage (Parquet)</text> | |
| <!-- ===== LEFT SIDE: ROW STORAGE (CSV) ===== --> | |
| <text x="170" y="65" text-anchor="middle" fill="#e94560" font-size="14" font-weight="bold" font-family="-apple-system, BlinkMacSystemFont, 'Segoe UI', sans-serif">CSV: Row by Row</text> | |
| <!-- Header row --> | |
| <rect x="30" y="80" width="90" height="28" rx="4" fill="#e94560" opacity="0.6"/> | |
| <rect x="125" y="80" width="60" height="28" rx="4" fill="#f0a500" opacity="0.6"/> | |
| <rect x="190" y="80" width="70" height="28" rx="4" fill="#4ecca3" opacity="0.6"/> | |
| <text x="75" y="99" text-anchor="middle" fill="#fff" font-size="11" font-weight="bold" font-family="monospace">name</text> | |
| <text x="155" y="99" text-anchor="middle" fill="#fff" font-size="11" font-weight="bold" font-family="monospace">age</text> | |
| <text x="225" y="99" text-anchor="middle" fill="#fff" font-size="11" font-weight="bold" font-family="monospace">city</text> | |
| <!-- Row 1 --> | |
| <rect x="30" y="114" width="90" height="28" rx="4" fill="#e94560" opacity="0.2"/> | |
| <rect x="125" y="114" width="60" height="28" rx="4" fill="#f0a500" opacity="0.2"/> | |
| <rect x="190" y="114" width="70" height="28" rx="4" fill="#4ecca3" opacity="0.2"/> | |
| <text x="75" y="133" text-anchor="middle" fill="#e0e0e0" font-size="11" font-family="monospace">Alice</text> | |
| <text x="155" y="133" text-anchor="middle" fill="#e0e0e0" font-size="11" font-family="monospace">25</text> | |
| <text x="225" y="133" text-anchor="middle" fill="#e0e0e0" font-size="11" font-family="monospace">NYC</text> | |
| <!-- Row 2 --> | |
| <rect x="30" y="148" width="90" height="28" rx="4" fill="#e94560" opacity="0.2"/> | |
| <rect x="125" y="148" width="60" height="28" rx="4" fill="#f0a500" opacity="0.2"/> | |
| <rect x="190" y="148" width="70" height="28" rx="4" fill="#4ecca3" opacity="0.2"/> | |
| <text x="75" y="167" text-anchor="middle" fill="#e0e0e0" font-size="11" font-family="monospace">Bob</text> | |
| <text x="155" y="167" text-anchor="middle" fill="#e0e0e0" font-size="11" font-family="monospace">30</text> | |
| <text x="225" y="167" text-anchor="middle" fill="#e0e0e0" font-size="11" font-family="monospace">LA</text> | |
| <!-- Row 3 --> | |
| <rect x="30" y="182" width="90" height="28" rx="4" fill="#e94560" opacity="0.2"/> | |
| <rect x="125" y="182" width="60" height="28" rx="4" fill="#f0a500" opacity="0.2"/> | |
| <rect x="190" y="182" width="70" height="28" rx="4" fill="#4ecca3" opacity="0.2"/> | |
| <text x="75" y="201" text-anchor="middle" fill="#e0e0e0" font-size="11" font-family="monospace">Carol</text> | |
| <text x="155" y="201" text-anchor="middle" fill="#e0e0e0" font-size="11" font-family="monospace">28</text> | |
| <text x="225" y="201" text-anchor="middle" fill="#e0e0e0" font-size="11" font-family="monospace">CHI</text> | |
| <!-- Row bracket: reads ALL data --> | |
| <rect x="28" y="112" width="234" height="100" rx="6" fill="none" stroke="#e94560" stroke-width="2" stroke-dasharray="6,3"/> | |
| <text x="170" y="230" text-anchor="middle" fill="#e94560" font-size="11" font-family="-apple-system, BlinkMacSystemFont, 'Segoe UI', sans-serif">↑ Must read every row and every column</text> | |
| <!-- Query label --> | |
| <rect x="40" y="250" width="240" height="34" rx="6" fill="#16213e" stroke="#8a8a9a" stroke-width="1"/> | |
| <text x="160" y="272" text-anchor="middle" fill="#8a8a9a" font-size="12" font-family="monospace">SELECT AVG(age) FROM data</text> | |
| <!-- Result --> | |
| <text x="160" y="305" text-anchor="middle" fill="#e94560" font-size="13" font-weight="bold" font-family="-apple-system, BlinkMacSystemFont, 'Segoe UI', sans-serif">Reads 9 cells to get 3 values</text> | |
| <!-- ===== DIVIDER ===== --> | |
| <line x1="350" y1="55" x2="350" y2="320" stroke="#3a3a5c" stroke-width="1" stroke-dasharray="4,4"/> | |
| <text x="350" y="340" text-anchor="middle" fill="#6c6c80" font-size="12" font-family="-apple-system, BlinkMacSystemFont, 'Segoe UI', sans-serif">vs</text> | |
| <!-- ===== RIGHT SIDE: COLUMN STORAGE (PARQUET) ===== --> | |
| <text x="530" y="65" text-anchor="middle" fill="#4ecca3" font-size="14" font-weight="bold" font-family="-apple-system, BlinkMacSystemFont, 'Segoe UI', sans-serif">Parquet: Column by Column</text> | |
| <!-- Name column --> | |
| <rect x="390" y="80" width="70" height="130" rx="4" fill="#e94560" opacity="0.15"/> | |
| <text x="425" y="99" text-anchor="middle" fill="#e94560" font-size="11" font-weight="bold" font-family="monospace">name</text> | |
| <text x="425" y="125" text-anchor="middle" fill="#8a8a9a" font-size="11" font-family="monospace">Alice</text> | |
| <text x="425" y="150" text-anchor="middle" fill="#8a8a9a" font-size="11" font-family="monospace">Bob</text> | |
| <text x="425" y="175" text-anchor="middle" fill="#8a8a9a" font-size="11" font-family="monospace">Carol</text> | |
| <!-- Age column (highlighted) --> | |
| <rect x="470" y="80" width="60" height="130" rx="4" fill="#f0a500" opacity="0.3"/> | |
| <rect x="470" y="80" width="60" height="130" rx="4" fill="none" stroke="#4ecca3" stroke-width="2.5"/> | |
| <text x="500" y="99" text-anchor="middle" fill="#f0a500" font-size="11" font-weight="bold" font-family="monospace">age</text> | |
| <text x="500" y="125" text-anchor="middle" fill="#e0e0e0" font-size="11" font-weight="bold" font-family="monospace">25</text> | |
| <text x="500" y="150" text-anchor="middle" fill="#e0e0e0" font-size="11" font-weight="bold" font-family="monospace">30</text> | |
| <text x="500" y="175" text-anchor="middle" fill="#e0e0e0" font-size="11" font-weight="bold" font-family="monospace">28</text> | |
| <!-- City column --> | |
| <rect x="540" y="80" width="70" height="130" rx="4" fill="#4ecca3" opacity="0.15"/> | |
| <text x="575" y="99" text-anchor="middle" fill="#4ecca3" font-size="11" font-weight="bold" font-family="monospace">city</text> | |
| <text x="575" y="125" text-anchor="middle" fill="#8a8a9a" font-size="11" font-family="monospace">NYC</text> | |
| <text x="575" y="150" text-anchor="middle" fill="#8a8a9a" font-size="11" font-family="monospace">LA</text> | |
| <text x="575" y="175" text-anchor="middle" fill="#8a8a9a" font-size="11" font-family="monospace">CHI</text> | |
| <!-- Arrow pointing to age column only --> | |
| <text x="530" y="230" text-anchor="middle" fill="#4ecca3" font-size="11" font-family="-apple-system, BlinkMacSystemFont, 'Segoe UI', sans-serif">↑ Reads only the age column</text> | |
| <!-- Query label --> | |
| <rect x="400" y="250" width="240" height="34" rx="6" fill="#16213e" stroke="#8a8a9a" stroke-width="1"/> | |
| <text x="520" y="272" text-anchor="middle" fill="#8a8a9a" font-size="12" font-family="monospace">SELECT AVG(age) FROM data</text> | |
| <!-- Result --> | |
| <text x="520" y="305" text-anchor="middle" fill="#4ecca3" font-size="13" font-weight="bold" font-family="-apple-system, BlinkMacSystemFont, 'Segoe UI', sans-serif">Reads 3 cells to get 3 values</text> | |
| <!-- ===== BOTTOM: Key takeaway ===== --> | |
| <rect x="100" y="360" width="500" height="44" rx="8" fill="#16213e" stroke="#4ecca3" stroke-width="1"/> | |
| <text x="350" y="380" text-anchor="middle" fill="#e0e0e0" font-size="12" font-family="-apple-system, BlinkMacSystemFont, 'Segoe UI', sans-serif">With 50 columns and 1M rows, Parquet reads <tspan fill="#4ecca3" font-weight="bold">1 column</tspan>.</text> | |
| <text x="350" y="396" text-anchor="middle" fill="#e0e0e0" font-size="12" font-family="-apple-system, BlinkMacSystemFont, 'Segoe UI', sans-serif">CSV reads <tspan fill="#e94560" font-weight="bold">all 50</tspan>: roughly 50x the I/O for equal-width columns.</text> | |
| </svg> | |
| </body> | |
| </html> |