| 1. | | Why scikit learn's fit transform is probably not for you (stephantul.github.io) |
| 1 point by stephantul 11 days ago |
|
| 2. | | Show HN: Semble – Code search for agents that uses 98% fewer tokens than grep (github.com/minishlab) |
| 8 points by stephantul 29 days ago |
|
| 3. | | Show HN: Semble – Fast code search for agents with near-transformer accuracy (github.com/minishlab) |
| 7 points by stephantul 36 days ago |
|
| 4. | | Show HN: Skeletoken, a Python package for editing model tokenizers (github.com/stephantul) |
| 1 point by stephantul 3 months ago |
|
| 5. | | Show HN: PyNIFE. 400-900× speedup for embedding-based retrieval pipelines (github.com/stephantul) |
| 2 points by stephantul 6 months ago |
|
| 6. | | Show HN: Skeletoken, a Package for Editing Tokenizers (github.com/stephantul) |
| 1 point by stephantul 8 months ago |
|
| 7. | | Turning any tokenizer into a greedy one (stephantul.github.io) |
| 2 points by stephantul 9 months ago | 1 comment |
|
| 8. | | Decasing Transformers for Fun (stephantul.github.io) |
| 3 points by stephantul 10 months ago | 1 comment |
|
| 9. | | Model2Vec as a Fasttext Alternative (minish.ai) |
| 5 points by stephantul 10 months ago | 1 comment |
|
| 10. | | Using overloads to handle union return types in Python (stephantul.github.io) |
| 1 point by stephantul on March 29, 2025 | 1 comment |
|
| 11. | | Ask HN: Favourite resources for learning programming type theory? |
| 6 points by stephantul on March 19, 2025 | 8 comments |
|
| 12. | | Evaluating ML classifiers using relative error instead of absolute accuracy (stephantul.github.io) |
| 1 point by stephantul on March 13, 2025 |
|
| 13. | | Defeat stringly typing without making your users unhappy (stephantul.github.io) |
| 2 points by stephantul on March 7, 2025 |
|
| 14. | | Distilling ModernBERT into a static model doesn't work (minishlab.github.io) |
| 5 points by stephantul on Jan 29, 2025 | 3 comments |
|
| 15. | | Show HN: SemHash – Fast Semantic Text Deduplication for Cleaner Datasets (github.com/minishlab) |
| 6 points by stephantul on Jan 19, 2025 |
|
| 16. | | Train faster static embedding models with sentence transformers (huggingface.co) |
| 52 points by stephantul on Jan 15, 2025 | 1 comment |
|
| 17. | | Semhash: Fast deduplication and dataset multitool in Python (minishlab.github.io) |
| 3 points by stephantul on Jan 13, 2025 | 1 comment |
|
| 18. | | Model2Vec: Make sentence transformers 500x faster on CPU, 15x smaller (huggingface.co) |
| 5 points by stephantul on Oct 16, 2024 |
|
| 19. | | Show HN: Model2Vec: make sentence transformers 500x faster on CPU, 15x smaller (github.com/minishlab) |
| 9 points by stephantul on Sept 29, 2024 | 2 comments |
|
| 20. | | Show HN: Model2Vec: make sentence transformers 500x faster on CPU, 15x smaller (github.com/minishlab) |
| 6 points by stephantul on Sept 22, 2024 | 2 comments |
|