- [R] Measuring Coding Challenge Competence With APPS. GPT fine-tuned on problems from educational coding websites and GitHub can pass approximately 15% of the test cases of introductory problems. https://arxiv.org/abs/2105.09938 11 comments machinelearning
Linking pages
- Language models are nearly AGIs but we don't notice it because we keep shifting the bar https://philosophybear.substack.com/p/language-models-are-nearly-agis-but 55 comments
- GitHub - CodedotAl/gpt-code-clippy: Full description can be found here: https://discuss.huggingface.co/t/pretrain-gpt-neo-for-open-source-github-copilot-model/7678?u=ncoop57 https://github.com/CodedotAl/gpt-code-clippy 13 comments
- AI Could Soon Write Code Based on Ordinary Language | WIRED https://www.wired.com/story/ai-write-code-ordinary-language/ 8 comments
- Salesforce's CodeT5 system can understand and generate code | VentureBeat https://venturebeat.com/2021/09/07/salesforces-codet5-system-can-understand-and-generate-code/ 1 comment
- AI Weekly: The promise and limitations of machine programming tools | VentureBeat https://venturebeat.com/2021/06/18/ai-weekly-the-promise-and-limitations-of-machine-programming-tools/ 0 comments
- Prediction Market FAQ - by Scott Alexander https://astralcodexten.substack.com/p/prediction-market-faq 0 comments
- GitHub - MLGroupJLU/LLM-eval-survey: The official GitHub page for the survey paper "A Survey on Evaluation of Large Language Models". https://github.com/MLGroupJLU/LLM-eval-survey 0 comments
- GitHub - lmmlzn/Awesome-LLMs-Datasets: Summarize existing representative LLMs text datasets. https://github.com/lmmlzn/Awesome-LLMs-Datasets 0 comments
- Evaluating LLM Benchmarks for React | KiloBytes by KB https://kshitij-banerjee.github.io/2024/05/04/evaluating-llm-benchmarks-for-react/ 0 comments