@prettyirrelevant
Last active March 13, 2026 10:03
code review: EssTeeOoh/csv-to-sheets

hey, thanks for submitting the csv-to-sheets project. i've reviewed the codebase and wanted to share some feedback.

the project has a number of bugs and issues that would cause problems in real usage. there are several places where errors are silently swallowed, meaning the user would see broken or incomplete results with no indication that something went wrong. there are also code paths that would crash the app outright on certain inputs, and some logic that exists in the code but doesn't actually do anything. beyond the bugs, there are no tests anywhere in the project, which makes it hard to trust that things work correctly or to make changes safely.

i've put together a detailed technical breakdown here: https://gist.github.com/prettyirrelevant/72e37c4165230523b34c86c128307fe8

i'd recommend going through that and focusing on error handling: thinking through what happens when things fail, not just when they succeed. that's a critical skill and one we look for.

utils.py

1. silent latin-1 fallback

app/utils.py#L18-L21

try:
    content = file_bytes.decode("utf-8")
except UnicodeDecodeError:
    content = file_bytes.decode("latin-1")

latin-1 accepts every byte (0x00-0xFF) and never throws, so a renamed zip uploaded as .csv decodes into garbage and gets pushed to sheets, with no log entry on the fallback. prefer cp1252 if the goal is supporting excel exports; it actually rejects invalid bytes.
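to see the difference (illustrative bytes, not from the project):

```python
# latin-1 maps all 256 byte values, so decoding never fails; cp1252 leaves
# a handful of bytes (0x81, 0x8d, 0x8f, 0x90, 0x9d) undefined and raises
junk = bytes([0x50, 0x4B, 0x03, 0x04, 0x81])  # zip magic bytes + 0x81

print(junk.decode("latin-1"))  # "succeeds", yielding mojibake

try:
    junk.decode("cp1252")
except UnicodeDecodeError as e:
    print("cp1252 rejected it:", e.reason)
```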

also, csv.reader returns plain lists. csv.DictReader gives rows keyed by header, cleaner for column handling.
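for example (hypothetical csv, not from the project):

```python
import csv
import io

sample = "name,email\nada,ada@example.com\n"

# DictReader keys each row by the header row
for row in csv.DictReader(io.StringIO(sample)):
    print(row["name"], row["email"])  # access by column name, not index
```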

2. dead ragged row check

app/utils.py#L47-L52

for i, row in enumerate(rows[1:], start=2):
    if len(row) != num_columns:
        pass

iterates every row, checks column count, does nothing. remove it or surface the mismatches.
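one way to surface them (a sketch; check_ragged is a hypothetical helper, not project code):

```python
def check_ragged(rows: list[list[str]]) -> list[list[str]]:
    # fail loudly when a data row's width doesn't match the header
    num_columns = len(rows[0])
    bad = [i for i, row in enumerate(rows[1:], start=2) if len(row) != num_columns]
    if bad:
        raise ValueError(f"rows {bad} don't match the header's {num_columns} columns")
    return rows
```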

3. memory will crash render on large files

app/utils.py#L18-L24

file_bytes, the decoded content, and the materialized rows are all in memory at once, roughly 2-3x the file size. a 200MB csv needs 400-600MB, so a 512MB render instance will OOM.

fix: stream with a generator.

import csv
import io

def parse_csv(file_bytes: bytes):
    # note: TextIOWrapper decodes lazily, so a UnicodeDecodeError would
    # surface mid-iteration, not at construction; probe the encoding up front
    try:
        file_bytes.decode("utf-8")
        encoding = "utf-8"
    except UnicodeDecodeError:
        encoding = "latin-1"

    text_stream = io.TextIOWrapper(io.BytesIO(file_bytes), encoding=encoding)
    for row in csv.reader(text_stream):
        if any(cell.strip() for cell in row):
            yield row

sheets.py

1. auth is flaky and brittle

app/sheets.py#L21-L90

two completely different code paths for local vs hosted auth, no tests for either. hosted mode doesn't persist refreshed tokens so it re-authenticates on every request. no confidence any of this works reliably.
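for the token persistence piece, the usual pattern is refresh-then-save (a sketch; ensure_fresh and TOKEN_PATH are hypothetical, and creds stands in for a google-auth style credentials object):

```python
TOKEN_PATH = "token.json"  # hypothetical location

def ensure_fresh(creds, transport_request, token_path=TOKEN_PATH):
    # refresh an expired token once and persist it, so the next request
    # reuses it instead of re-authenticating. with google-auth, creds would
    # be a google.oauth2.credentials.Credentials and transport_request a
    # google.auth.transport.requests.Request
    if creds.expired and creds.refresh_token:
        creds.refresh(transport_request)
        with open(token_path, "w") as f:
            f.write(creds.to_json())
    return creds
```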

2. except (HttpError, Exception) is redundant

app/sheets.py#L160-L161

except (HttpError, Exception) as e:

Exception is a superclass of HttpError, so this just catches Exception. more importantly, it retries permanent errors (403, 404) the same as transient ones. retrying a 403 three times with backoff is wasted time.
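a sketch of status-aware retries (call_with_retry and status_of are hypothetical; for googleapiclient's HttpError the status lives at e.resp.status):

```python
import time

TRANSIENT = {429, 500, 502, 503}

def call_with_retry(fn, status_of, attempts=3, base_delay=1.0):
    # retry only transient statuses with exponential backoff;
    # re-raise permanent errors (403, 404) immediately.
    # status_of maps a raised error to its http status code
    for attempt in range(attempts):
        try:
            return fn()
        except Exception as e:
            if status_of(e) not in TRANSIENT or attempt == attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)
```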

3. duplicate cell limit constant

app/sheets.py#L199-L208

CELL_LIMIT = 10_000_000
if total_cells > CELL_LIMIT:
    raise ValueError(...)

CELL_LIMIT defined inline here and separately in main.py:26. the check in upload_data_to_sheet is dead code in the current flow since main.py already rejects oversized data before this is called. if kept as defensive code, extract the constant so the two can't drift.
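e.g. a tiny shared module (app/config.py is a hypothetical name):

```python
# app/config.py
CELL_LIMIT = 10_000_000  # google sheets' per-spreadsheet cell cap

# main.py and sheets.py then both do:
# from app.config import CELL_LIMIT
```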

4. failed batches produce a sheet with silent gaps

app/sheets.py#L221-L224

except RuntimeError as e:
    print(f"  ✗ Failed batch at row {sheet_row}: {e}")
    failed_batches.append(sheet_row)

failed batches are logged and skipped. the user gets a sheet with missing chunks of data and no indication anything went wrong.
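a sketch of failing loudly instead (upload_batches and upload_one are hypothetical stand-ins for the real loop and per-batch sheets call):

```python
def upload_batches(batches, upload_one):
    # collect failed batch offsets and raise after the loop, so a partial
    # upload is reported instead of silently producing gaps
    failed = []
    for sheet_row, batch in batches:
        try:
            upload_one(sheet_row, batch)
        except RuntimeError:
            failed.append(sheet_row)
    if failed:
        raise RuntimeError(f"upload incomplete: batches at rows {failed} failed")
```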


main.py

1. file.filename can be None

app/main.py#L68

if not file.filename.endswith(".csv"):

UploadFile.filename is Optional[str]. if the client doesn't send one, this crashes with AttributeError and returns a 500 instead of a 400.
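a minimal guard (validate_filename is a hypothetical helper; in the route you'd raise HTTPException(400) instead of ValueError):

```python
def validate_filename(filename):
    # handle filename being None before calling .endswith, so a missing
    # filename becomes a clean 400 rather than an AttributeError/500
    if not filename or not filename.endswith(".csv"):
        raise ValueError("a .csv file is required")
    return filename
```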

2. background upload failures are invisible

app/main.py#L29-L38

def background_upload(spreadsheet_id: str, data: list[list]):
    try:
        sheets_service, _ = get_google_services()
        upload_data_to_sheet(sheets_service, spreadsheet_id, data)
    except Exception as e:
        print(f"Background upload failed for sheet {spreadsheet_id}: {e}")

user already has a 202 with a sheet url. if the upload fails, they see an empty sheet with no indication anything went wrong. also, there's no way to track progress. bad design imo.
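one cheap way to make failures visible (a sketch, not project code; a real deployment would back this with redis or a db, since in-process state dies with the worker):

```python
import uuid

# minimal in-memory job registry the api could expose via a
# hypothetical /status/{job_id} endpoint
JOBS: dict[str, str] = {}

def start_job() -> str:
    job_id = uuid.uuid4().hex
    JOBS[job_id] = "pending"
    return job_id

def background_upload(job_id: str, do_upload) -> None:
    try:
        do_upload()
        JOBS[job_id] = "done"
    except Exception as e:
        JOBS[job_id] = f"failed: {e}"
```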

3. orphaned sheets on failure

app/main.py#L132-L148

spreadsheet_id = create_spreadsheet(...)
make_sheet_public(drive_service, spreadsheet_id)

if make_sheet_public fails, the sheet already exists. the except block returns a 500 but never deletes the orphaned sheet.
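a sketch of the cleanup (the three callables stand in for the real sheets/drive operations; deleting would be drive_service.files().delete(fileId=...).execute()):

```python
def create_public_sheet(create_spreadsheet, make_public, delete_sheet):
    # if publishing fails, delete the sheet we just created so no orphan
    # is left behind, then let the error propagate
    spreadsheet_id = create_spreadsheet()
    try:
        make_public(spreadsheet_id)
    except Exception:
        delete_sheet(spreadsheet_id)
        raise
    return spreadsheet_id
```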

4. rows_queued doesn't match reality

app/main.py#L161

"rows_queued": total_rows - 1,

subtracts 1 for the header, but background_upload receives and uploads all rows including the header.

5. manual chunked size check

app/main.py#L76-L99

starlette already supports max_part_size on request.form(), which rejects oversized uploads at the parsing level. instead of manual chunked reading, this could be:

@app.post("/upload")
async def upload_csv(request: Request, background_tasks: BackgroundTasks):
    async with request.form(max_part_size=200 * 1024 * 1024) as form:
        file = form["file"]
        contents = await file.read()
        ...

6. no rate limiting on a public api

less of a blocker since this is a test project, but worth keeping in mind: no rate limiting, no CORS config. anyone can hit /upload and burn through the google api quota.


general

no tests anywhere in this repo. no unit tests, no integration tests, nothing. any change risks breaking existing behavior with zero safety net.

bare except Exception used throughout the codebase. this swallows everything, including bugs you'd want to see. always catch the specific exceptions you expect and let the rest propagate.

docker-compose.yml is unnecessary. this is a single fastapi app with no database, no redis, no separate services. a Dockerfile is all you need.

nothing wrong with using ai to write code, but i'd have expected it to produce tests and catch most of these bugs.
