From 945d4b4572879b4b33d15c9e6617791c85624e57 Mon Sep 17 00:00:00 2001
From: Tobi Lutke
Date: Sun, 21 Dec 2025 13:10:35 -0400
Subject: [PATCH] Add 6 synthetic evaluation documents
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Topics covered:
- API design principles
- Startup fundraising memo
- Distributed systems overview
- Product launch retrospective
- Machine learning primer
- Remote work policy

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5
---
 test/eval-docs/api-design-principles.md       |  73 ++++++++++
 .../eval-docs/distributed-systems-overview.md |  92 +++++++++++++
 test/eval-docs/machine-learning-primer.md     | 125 ++++++++++++++++++
 .../eval-docs/product-launch-retrospective.md |  77 +++++++++++
 test/eval-docs/remote-work-policy.md          | 123 +++++++++++++++++
 test/eval-docs/startup-fundraising-memo.md    |  86 ++++++++++++
 6 files changed, 576 insertions(+)
 create mode 100644 test/eval-docs/api-design-principles.md
 create mode 100644 test/eval-docs/distributed-systems-overview.md
 create mode 100644 test/eval-docs/machine-learning-primer.md
 create mode 100644 test/eval-docs/product-launch-retrospective.md
 create mode 100644 test/eval-docs/remote-work-policy.md
 create mode 100644 test/eval-docs/startup-fundraising-memo.md

diff --git a/test/eval-docs/api-design-principles.md b/test/eval-docs/api-design-principles.md
new file mode 100644
index 0000000..e628e4b
--- /dev/null
+++ b/test/eval-docs/api-design-principles.md
@@ -0,0 +1,73 @@
+# API Design Principles
+
+## Introduction
+
+Good API design is crucial for developer experience. This document outlines the core principles we follow when designing REST APIs.
+
+## Principle 1: Use Nouns, Not Verbs
+
+URLs should represent resources, not actions. Use HTTP methods to indicate the action.
+
+**Good:**
+- GET /users/123
+- POST /orders
+- DELETE /products/456
+
+**Bad:**
+- GET /getUser?id=123
+- POST /createOrder
+- GET /deleteProduct/456
+
+## Principle 2: Use Plural Nouns
+
+Always use plural nouns for consistency.
+
+- /users (not /user)
+- /orders (not /order)
+- /products (not /product)
+
+## Principle 3: Hierarchical Relationships
+
+Express relationships through URL hierarchy.
+
+- GET /users/123/orders - Get all orders for user 123
+- GET /users/123/orders/456 - Get specific order 456 for user 123
+
+## Principle 4: Filtering and Pagination
+
+Use query parameters for filtering, sorting, and pagination.
+
+- GET /products?category=electronics&sort=price&page=2&limit=20
+
+## Principle 5: Versioning
+
+Always version your APIs. We prefer URL versioning.
+
+- /v1/users
+- /v2/users
+
+## Principle 6: Error Handling
+
+Return consistent error responses with appropriate HTTP status codes.
+
+```json
+{
+  "error": {
+    "code": "VALIDATION_ERROR",
+    "message": "Email format is invalid",
+    "field": "email"
+  }
+}
+```
+
+## Principle 7: Rate Limiting
+
+Implement rate limiting and communicate limits via headers:
+
+- X-RateLimit-Limit: 1000
+- X-RateLimit-Remaining: 999
+- X-RateLimit-Reset: 1640000000
+
+## Conclusion
+
+Following these principles leads to APIs that are intuitive, consistent, and easy to maintain. Remember: the best API is one that developers can use without reading documentation.
diff --git a/test/eval-docs/distributed-systems-overview.md b/test/eval-docs/distributed-systems-overview.md
new file mode 100644
index 0000000..b2073dd
--- /dev/null
+++ b/test/eval-docs/distributed-systems-overview.md
@@ -0,0 +1,92 @@
+# Distributed Systems: A Practical Overview
+
+## What Makes a System "Distributed"?
+
+A distributed system is a collection of independent computers that appears to users as a single coherent system. The key challenges arise from:
+
+1. **Partial failure** - Parts of the system can fail independently
+2. **Unreliable networks** - Messages can be lost, delayed, or duplicated
+3. **No global clock** - Different nodes have different views of time
+
+## The CAP Theorem
+
+Eric Brewer's CAP theorem states that a distributed system can only provide two of three guarantees:
+
+- **Consistency**: All nodes see the same data at the same time
+- **Availability**: Every request receives a response
+- **Partition tolerance**: System continues operating despite network partitions
+
+In practice, network partitions happen, so you're really choosing between CP and AP systems.
+
+### CP Systems (Consistency + Partition Tolerance)
+- Examples: ZooKeeper, etcd, Consul
+- Sacrifice availability during partitions
+- Good for: coordination, leader election, configuration
+
+### AP Systems (Availability + Partition Tolerance)
+- Examples: Cassandra, DynamoDB, CouchDB
+- Sacrifice consistency during partitions
+- Good for: high-throughput, always-on services
+
+## Consensus Algorithms
+
+When nodes need to agree on something, they use consensus algorithms.
+
+### Paxos
+- Original consensus algorithm by Leslie Lamport
+- Notoriously difficult to understand and implement
+- Foundation for many other algorithms
+
+### Raft
+- Designed to be understandable
+- Used in etcd, Consul, CockroachDB
+- Separates leader election from log replication
+
+### PBFT (Practical Byzantine Fault Tolerance)
+- Handles malicious nodes
+- Used in blockchain systems
+- Higher overhead than crash-fault-tolerant algorithms
+
+## Replication Strategies
+
+### Single-Leader Replication
+- One node accepts writes
+- Followers replicate from leader
+- Simple but leader is bottleneck
+
+### Multi-Leader Replication
+- Multiple nodes accept writes
+- Must handle write conflicts
+- Good for multi-datacenter deployments
+
+### Leaderless Replication
+- Any node accepts writes
+- Uses quorum reads/writes
+- Examples: Dynamo-style databases
+
+## Consistency Models
+
+From strongest to weakest:
+
+1. **Linearizability** - Operations appear instantaneous
+2. **Sequential consistency** - Operations appear in some sequential order
+3. **Causal consistency** - Causally related operations appear in order
+4. **Eventual consistency** - Given enough time, all replicas converge
+
+## Partitioning (Sharding)
+
+Distributing data across nodes:
+
+### Hash Partitioning
+- Hash key to determine partition
+- Even distribution
+- Range queries are inefficient
+
+### Range Partitioning
+- Ranges of keys on different nodes
+- Good for range queries
+- Risk of hot spots
+
+## Conclusion
+
+Building distributed systems requires understanding these fundamental concepts. Start simple, add complexity only when needed, and always plan for failure.
diff --git a/test/eval-docs/machine-learning-primer.md b/test/eval-docs/machine-learning-primer.md
new file mode 100644
index 0000000..912a6a9
--- /dev/null
+++ b/test/eval-docs/machine-learning-primer.md
@@ -0,0 +1,125 @@
+# Machine Learning: A Beginner's Guide
+
+## What is Machine Learning?
+
+Machine learning is a subset of artificial intelligence where systems learn patterns from data rather than being explicitly programmed. Instead of writing rules, you provide examples and let the algorithm discover the rules.
+
+## Types of Machine Learning
+
+### Supervised Learning
+
+The algorithm learns from labeled examples.
+
+**Classification**: Predicting categories
+- Email spam detection
+- Image recognition
+- Medical diagnosis
+
+**Regression**: Predicting continuous values
+- House price prediction
+- Stock price forecasting
+- Temperature prediction
+
+Common algorithms:
+- Linear Regression
+- Logistic Regression
+- Decision Trees
+- Random Forests
+- Support Vector Machines (SVM)
+- Neural Networks
+
+### Unsupervised Learning
+
+The algorithm finds patterns in unlabeled data.
+
+**Clustering**: Grouping similar items
+- Customer segmentation
+- Document categorization
+- Anomaly detection
+
+**Dimensionality Reduction**: Simplifying data
+- Feature extraction
+- Visualization
+- Noise reduction
+
+Common algorithms:
+- K-Means Clustering
+- Hierarchical Clustering
+- Principal Component Analysis (PCA)
+- t-SNE
+
+### Reinforcement Learning
+
+The algorithm learns through trial and error, receiving rewards or penalties.
+
+Applications:
+- Game playing (AlphaGo, chess)
+- Robotics
+- Autonomous vehicles
+- Resource management
+
+## The Machine Learning Pipeline
+
+1. **Data Collection**: Gather relevant data
+2. **Data Cleaning**: Handle missing values, outliers
+3. **Feature Engineering**: Create useful features
+4. **Model Selection**: Choose appropriate algorithm
+5. **Training**: Fit model to training data
+6. **Evaluation**: Test on held-out data
+7. **Deployment**: Put model into production
+8. **Monitoring**: Track performance over time
+
+## Key Concepts
+
+### Overfitting vs Underfitting
+
+**Overfitting**: Model memorizes training data, performs poorly on new data
+- Solution: More data, regularization, simpler model
+
+**Underfitting**: Model too simple to capture patterns
+- Solution: More features, complex model, less regularization
+
+### Train/Test Split
+
+Never evaluate on training data. Common splits:
+- 80% training, 20% testing
+- 70% training, 15% validation, 15% testing
+
+### Cross-Validation
+
+K-fold cross-validation provides more robust evaluation:
+1. Split data into K folds
+2. Train on K-1 folds, test on remaining fold
+3. Repeat K times
+4. Average the results
+
+### Bias-Variance Tradeoff
+
+- **High Bias**: Oversimplified model (underfitting)
+- **High Variance**: Overcomplicated model (overfitting)
+- Goal: Find the sweet spot
+
+## Evaluation Metrics
+
+### Classification
+- Accuracy: Correct predictions / Total predictions
+- Precision: True positives / Predicted positives
+- Recall: True positives / Actual positives
+- F1 Score: Harmonic mean of precision and recall
+- AUC-ROC: Area under receiver operating curve
+
+### Regression
+- Mean Absolute Error (MAE)
+- Mean Squared Error (MSE)
+- Root Mean Squared Error (RMSE)
+- R-squared (R2)
+
+## Getting Started
+
+1. Learn Python and libraries (NumPy, Pandas, Scikit-learn)
+2. Work through classic datasets (Iris, MNIST, Titanic)
+3. Take online courses (Coursera, fast.ai)
+4. Practice on Kaggle competitions
+5. Build projects with real-world data
+
+Remember: Machine learning is 80% data preparation and 20% modeling. Start with clean data and simple models before going complex.
diff --git a/test/eval-docs/product-launch-retrospective.md b/test/eval-docs/product-launch-retrospective.md
new file mode 100644
index 0000000..8d7d394
--- /dev/null
+++ b/test/eval-docs/product-launch-retrospective.md
@@ -0,0 +1,77 @@
+# Product Launch Retrospective: Project Phoenix
+
+**Date:** November 2024
+**Facilitator:** Product Team
+**Attendees:** Engineering, Design, Marketing, Sales
+
+## Context
+
+Project Phoenix was our Q3 initiative to launch a new analytics dashboard. The feature shipped on September 15th after a 4-month development cycle.
+
+## What Went Well
+
+### 1. Cross-functional Collaboration
+The weekly sync between engineering, design, and product prevented misalignment. Design reviews caught issues early, saving significant rework.
+
+### 2. Beta Program
+Our 20-customer beta program identified 47 bugs before launch. Customer feedback directly shaped the final UI.
+
+### 3. Documentation
+Engineering wrote comprehensive API docs. The developer portal received positive feedback from partners.
+
+### 4. Launch Metrics
+- Day 1 adoption: 34% of active users
+- Week 1 retention: 67%
+- NPS from early users: +42
+
+## What Could Have Gone Better
+
+### 1. Timeline Pressure
+The original June deadline was unrealistic. We cut corners on test coverage (only 62% vs. our 80% target).
+
+### 2. Performance Issues
+Initial load time was 4.2 seconds. We had to hotfix performance optimizations in week 2.
+
+### 3. Mobile Experience
+Mobile was deprioritized. The responsive design has usability issues on smaller screens.
+
+### 4. Sales Enablement
+Sales team wasn't trained until launch day. Early deals had inconsistent positioning.
+
+## Key Metrics Post-Launch
+
+| Metric | Target | Actual | Status |
+|--------|--------|--------|--------|
+| MAU | 10,000 | 12,400 | Exceeded |
+| Avg Session Duration | 5 min | 7.2 min | Exceeded |
+| Error Rate | <0.1% | 0.3% | Missed |
+| Support Tickets | <50/week | 73/week | Missed |
+
+## Action Items
+
+1. **Testing**: Establish minimum 75% coverage for all new features
+   - Owner: Engineering Lead
+   - Due: December 1st
+
+2. **Performance Budget**: Add performance gates to CI/CD
+   - Owner: Platform Team
+   - Due: December 15th
+
+3. **Mobile-First**: Require mobile designs before development starts
+   - Owner: Design Lead
+   - Due: Immediate
+
+4. **Sales Training**: Build 2-week lead time for enablement
+   - Owner: Product Marketing
+   - Due: Next launch
+
+## Lessons Learned
+
+1. Beta programs are invaluable - expand to 30+ customers
+2. Performance testing should be part of definition of done
+3. Cross-functional alignment works - keep the weekly syncs
+4. Documentation pays off - developers loved the API docs
+
+## Follow-up
+
+Schedule 30-day post-launch review for October 15th to assess long-term adoption patterns.
diff --git a/test/eval-docs/remote-work-policy.md b/test/eval-docs/remote-work-policy.md
new file mode 100644
index 0000000..b5c6f3d
--- /dev/null
+++ b/test/eval-docs/remote-work-policy.md
@@ -0,0 +1,123 @@
+# Remote Work Policy
+
+**Effective Date:** January 2024
+**Last Updated:** March 2024
+**Applies To:** All full-time employees
+
+## Overview
+
+We believe in flexibility and trust. This policy outlines expectations for remote work arrangements to ensure productivity while maintaining work-life balance.
+
+## Eligibility
+
+All full-time employees who have completed their 90-day probationary period are eligible for remote work. Some roles requiring physical presence (e.g., office management, hardware engineering) may have modified arrangements.
+
+## Work Arrangements
+
+### Fully Remote
+- Work from anywhere within approved time zones
+- No requirement to visit office
+- Must attend quarterly in-person gatherings
+
+### Hybrid
+- 2-3 days per week in office
+- Flexible scheduling with manager approval
+- Core collaboration days: Tuesday and Thursday
+
+### Office-Based
+- Primary work location is company office
+- Occasional remote days allowed with notice
+
+## Expectations
+
+### Availability
+- Be available during core hours: 10 AM - 3 PM local time
+- Respond to messages within 2 hours during work hours
+- Block focus time on calendar if needed
+
+### Communication
+- Camera on for team meetings
+- Update Slack status when away
+- Share working hours in calendar
+
+### Workspace Requirements
+- Reliable internet connection (minimum 25 Mbps)
+- Quiet space for video calls
+- Ergonomic setup (we provide $500 home office stipend)
+
+### Security
+- Use company VPN for all work
+- Lock computer when stepping away
+- No work on public WiFi without VPN
+- Report lost devices immediately
+
+## Equipment
+
+Company provides:
+- Laptop (MacBook Pro or equivalent)
+- Monitor (up to 27")
+- Keyboard and mouse
+- Headset for calls
+
+$500 annual stipend for:
+- Desk and chair
+- Lighting
+- Other ergonomic equipment
+
+## Time Zones
+
+### Approved Time Zones
+- Americas: UTC-8 to UTC-3
+- Europe: UTC-1 to UTC+3
+- Asia-Pacific: UTC+8 to UTC+12
+
+Work outside approved time zones requires VP approval.
+
+### Async First
+- Default to async communication
+- Document decisions in writing
+- Use Loom for complex explanations
+- Reserve meetings for collaboration, not status updates
+
+## International Remote Work
+
+Extended stays (>30 days) in another country require:
+1. HR approval
+2. Tax implications review
+3. Legal compliance check
+4. Updated work agreement
+
+Some countries are not permitted due to legal/tax complexity.
+
+## In-Person Requirements
+
+### Mandatory Events
+- Annual company retreat (1 week)
+- Quarterly team gatherings (2 days)
+- Onboarding week for new hires
+
+### Travel
+- Company covers all travel expenses
+- Book through approved travel platform
+- Submit expenses within 30 days
+
+## Performance
+
+Remote work is a privilege maintained through:
+- Meeting deadlines and commitments
+- Responsive communication
+- Quality of work output
+- Team collaboration
+
+Performance issues may result in modified arrangements.
+
+## Manager Responsibilities
+
+- Regular 1:1 meetings (weekly recommended)
+- Clear goal setting and feedback
+- Inclusive meeting scheduling across time zones
+- Address isolation or burnout proactively
+
+## Questions
+
+Contact HR or your manager for questions about this policy. We review and update this policy annually based on feedback.
diff --git a/test/eval-docs/startup-fundraising-memo.md b/test/eval-docs/startup-fundraising-memo.md
new file mode 100644
index 0000000..c334f3c
--- /dev/null
+++ b/test/eval-docs/startup-fundraising-memo.md
@@ -0,0 +1,86 @@
+# Series A Fundraising Strategy Memo
+
+**To:** Leadership Team
+**From:** CEO
+**Date:** March 2024
+**Subject:** Series A Planning and Timeline
+
+## Executive Summary
+
+We are targeting a $15M Series A raise at a $60M pre-money valuation. This memo outlines our strategy, timeline, and key milestones.
+
+## Current Metrics
+
+- ARR: $2.4M (growing 15% MoM)
+- Customers: 127 paying companies
+- Net Revenue Retention: 124%
+- Burn Rate: $350K/month
+- Runway: 14 months
+
+## Target Investors
+
+We're focusing on three tiers:
+
+### Tier 1 (Lead candidates)
+- Sequoia Capital - Strong enterprise SaaS focus
+- Andreessen Horowitz - Previous interest from partner
+- Index Ventures - European expansion thesis fits
+
+### Tier 2 (Co-investors)
+- First Round Capital
+- Founder Collective
+- SV Angel
+
+### Tier 3 (Strategic)
+- Salesforce Ventures
+- Google Ventures
+
+## Timeline
+
+**April 2024**
+- Finalize data room
+- Update financial model
+- Prepare pitch deck v2
+
+**May 2024**
+- Warm introductions begin
+- First partner meetings
+- Initial term sheets expected
+
+**June 2024**
+- Partner meetings continue
+- Negotiate terms
+- Select lead investor
+
+**July 2024**
+- Due diligence
+- Legal documentation
+- Close round
+
+## Use of Funds
+
+- Engineering (50%): Scale team from 8 to 20
+- Sales (30%): Build outbound motion, hire 5 AEs
+- Marketing (15%): Brand, content, events
+- G&A (5%): Operations infrastructure
+
+## Key Risks
+
+1. **Market timing** - Enterprise budgets tightening
+2. **Competition** - Two well-funded competitors announced
+3. **Valuation expectations** - Market multiples compressed
+
+## Board Composition
+
+Post-Series A board will be:
+- 2 Founders
+- 1 Lead investor
+- 1 Independent (to be recruited)
+
+## Next Steps
+
+1. Schedule strategy session for April 5th
+2. CFO to update financial model by April 10th
+3. Begin investor outreach April 15th
+
+Questions? Let's discuss at our next all-hands.