New functionality must follow the red-green-refactor cycle:
1. Write failing tests first
2. Confirm they fail before implementing
3. Implement until tests pass
4. Run full suite to check for regressions

2026-06-02 09:31:11 +02:00

5.5 KiB

Raw Blame History

Timesheets — Agent Guide

Read README.md first for a full description of the project, its subcommands, config file format, and Joplin notebook structure.

Package layout

timesheets/
├── pyproject.toml
├── README.md
├── AGENTS.md
├── timesheets.example.toml
└── src/timesheets/
    ├── cli.py        # argument parsing, main() entry point
    ├── parser.py     # markdown table parsing, aggregation, date filtering
    ├── projects.py   # project_map.json loading and key resolution
    ├── output.py     # CSV writing, summary, stories, and status printing
    ├── config.py     # TOML config file loading and key extraction
    ├── joplin.py     # Joplin API integration (notebook traversal, note fetching)
    ├── status.py     # day/week status calculations
    └── utils.py      # shared low-level helpers (duration parsing, formatting, etc.)

Tests live in tests/, one file per source module:

tests/
├── test_utils.py
├── test_parser.py
├── test_projects.py
├── test_config.py
├── test_output.py
├── test_joplin.py
└── test_status.py

Package manager — uv

All dependency management and script execution is done via uv. Do not use pip or python directly.

Task	Command
Install / sync dependencies	`uv sync`
Add a runtime dependency	`uv add <package>`
Add a dev-only dependency	`uv add --dev <package>`
Run the CLI	`uv run timesheets <args>`
Run any Python script	`uv run python <script>`

Testing

The test suite uses pytest with pytest-cov for coverage reporting.

# Run all tests
uv run pytest

# Run with coverage report
uv run pytest --cov

# Run a specific test file
uv run pytest tests/test_parser.py

# Run a specific test
uv run pytest tests/test_parser.py::TestParseTable::test_empty_input

Rules for adding or changing functionality

Use test-driven development (TDD) for new functionality. The workflow is:
1. Write the tests first, based on the expected behaviour.
2. Confirm the new tests fail before writing any implementation:
```
uv run pytest tests/test_<module>.py
```
3. Implement the functionality until all tests pass.
4. Run the full suite to confirm nothing regressed:
```
uv run pytest --cov
```
Skipping the "confirm it fails" step defeats the purpose of TDD — a test that never fails gives no confidence.
Always update or add tests when modifying existing behaviour. Tests live in the tests/ file that corresponds to the module being changed (e.g. changes to parser.py → tests/test_parser.py).
Do not reduce coverage. Every new function or branch should have at least one test covering the happy path. Edge cases and error paths should be covered where the logic is non-trivial.
cli.py is intentionally excluded from unit tests — it is thin glue code. All logic worth testing belongs in the other modules.
Joplin integration tests in test_joplin.py must mock ClientApi — do not require a live Joplin instance.

Behavioral Guidelines

Behavioral guidelines to reduce common LLM coding mistakes. Merge with project-specific instructions as needed.

Tradeoff: These guidelines bias toward caution over speed. For trivial tasks, use judgment.

1. Think Before Coding

Don't assume. Don't hide confusion. Surface tradeoffs.

Before implementing:

State your assumptions explicitly. If uncertain, ask.
If multiple interpretations exist, present them - don't pick silently.
If a simpler approach exists, say so. Push back when warranted.
If something is unclear, stop. Name what's confusing. Ask.

2. Simplicity First

Minimum code that solves the problem. Nothing speculative.

No features beyond what was asked.
No abstractions for single-use code.
No "flexibility" or "configurability" that wasn't requested.
No error handling for impossible scenarios.
If you write 200 lines and it could be 50, rewrite it.

Ask yourself: "Would a senior engineer say this is overcomplicated?" If yes, simplify.

3. Surgical Changes

Touch only what you must. Clean up only your own mess.

When editing existing code:

Don't "improve" adjacent code, comments, or formatting.
Don't refactor things that aren't broken.
Match existing style, even if you'd do it differently.
If you notice unrelated dead code, mention it - don't delete it.

When your changes create orphans:

Remove imports/variables/functions that YOUR changes made unused.
Don't remove pre-existing dead code unless asked.

The test: Every changed line should trace directly to the user's request.

4. Goal-Driven Execution

Define success criteria. Loop until verified.

Transform tasks into verifiable goals:

"Add validation" → "Write tests for invalid inputs, then make them pass"
"Fix the bug" → "Write a test that reproduces it, then make it pass"
"Refactor X" → "Ensure tests pass before and after"

For multi-step tasks, state a brief plan:

1. [Step] → verify: [check]
2. [Step] → verify: [check]
3. [Step] → verify: [check]

Strong success criteria let you loop independently. Weak criteria ("make it work") require constant clarification.

These guidelines are working if: fewer unnecessary changes in diffs, fewer rewrites due to overcomplication, and clarifying questions come before implementation rather than after mistakes.

5.5 KiB Raw Blame History