Hacker News
- We Evaluated ChatGPT vs. Google on 500 Search Queries https://www.surgehq.ai/blog/googles-existential-threat-chatgpt-matches-googles-performance-on-informational-search-queries-and-smashes-it-on-coding 11 comments
- HellaSwag: 36% of this popular large language model benchmark contains errors https://www.surgehq.ai/blog/hellaswag-or-hellabad-36-of-this-popular-llm-benchmark-contains-errors 8 comments
- Move Over, Google: The TikTokification of Next-Gen Search https://www.surgehq.ai/blog/mover-over-google-the-tiktokification-of-next-gen-search 4 comments
- Evaluation of TikTok vs. Instagram Reels https://www.surgehq.ai/blog/tiktok-vs-instagram-reels-personalized-human-evaluation 263 comments
- 30% of Google's Emotions Dataset Is Mislabeled https://www.surgehq.ai/blog/30-percent-of-googles-reddit-emotions-dataset-is-mislabeled 144 comments
- Generating Children’s Stories Using GPT-3 and DALL·E https://www.surgehq.ai/blog/generating-childrens-stories-using-gpt-3-and-dall-e 145 comments
- I wanted burritos. Facebook Search sent me to a dead restaurant 45m away https://www.surgehq.ai/blog/measuring-facebook-search-its-ai-sent-me-45m-away-for-burritos 33 comments
- Is Elon right? We labeled 500 Twitter users to measure the amount of Spam https://www.surgehq.ai/blog/we-measured-the-percentage-of-spammy-twitter-users 5 comments
- We asked 100 humans to draw the DALL·E prompts https://www.surgehq.ai/blog/humans-vs-dall-e 73 comments
- The average number of ads on a Google Search recipe? 8.7 https://www.surgehq.ai/blog/the-average-number-of-ads-on-a-google-search-recipe-8-7 3 comments
- Three areas where Google Search lags behind competitors: code, cooking, travel https://www.surgehq.ai/blog/google-search-is-falling-behind 347 comments
- Google Search Is Falling Behind https://www.surgehq.ai/blog/google-search-is-falling-behind 5 comments
- Building a no-code toxicity classifier by talking to GitHub Copilot https://www.surgehq.ai/blog/building-a-no-code-toxicity-classifier-by-talking-to-copilot 143 comments
- Are popular toxicity models simply profanity detectors? https://www.surgehq.ai/blog/are-popular-toxicity-models-simply-profanity-detectors 211 comments
- An Analysis of Omicron Tweets: 30% Are Skeptical of the Medical Establishment https://www.surgehq.ai/blog/omicron-tweets-analysis 2 comments
- Is Google Search Deteriorating? Measuring Google's Search Quality in 2022 https://www.surgehq.ai/blog/is-google-search-deteriorating-measuring-search-quality-in-2022 414 comments
Reddit
- Introduction to Reinforcement Learning with Human Feedback [D] https://www.surgehq.ai/blog/introduction-to-reinforcement-learning-with-human-feedback-rlhf-series-part-1 6 comments machinelearning
- We Evaluated ChatGPT vs. Google on 500 Search Queries https://www.surgehq.ai/blog/googles-existential-threat-chatgpt-matches-googles-performance-on-informational-search-queries-and-smashes-it-on-coding 9 comments languagetechnology
- 36% of HellaSwag benchmark contains errors [D] https://www.surgehq.ai/blog/hellaswag-or-hellabad-36-of-this-popular-llm-benchmark-contains-errors 6 comments machinelearning
- [D] Evaluating Image Generation Intelligence: Did Astral Codex Ten Win His Bet on AI Progress? https://www.surgehq.ai/blog/dall-e-vs-imagen-and-evaluating-astral-codex-tens-3000-ai-bet 6 comments machinelearning
- How Good is Hugging Face's BLOOM? Human Evaluation of Large Language Models [D] https://www.surgehq.ai/blog/how-good-is-hugging-faces-bloom-a-real-world-human-evaluation-of-language-models 28 comments machinelearning
- 30% of Google's Reddit Emotions Dataset is Mislabeled [D] https://www.surgehq.ai/blog/30-percent-of-googles-reddit-emotions-dataset-is-mislabeled 136 comments machinelearning
- Generating Children's Stories Using GPT-3 and DALL·E https://www.surgehq.ai/blog/generating-childrens-stories-using-gpt-3-and-dall-e 6 comments artificial
- Creating and Analyzing a Dataset of Roe v. Wade Tweets Labeled by Abortion Stance [P] https://www.surgehq.ai/blog/dataset-of-roe-v-wade-tweets-labeled-by-abortion-stance 4 comments machinelearning
- Humans vs. DALL·E — Where do human artists fit in a world of rich, creative AI? https://www.surgehq.ai/blog/humans-vs-dall-e 24 comments artificial
- Google Search is Falling Behind https://www.surgehq.ai/blog/google-search-is-falling-behind 2 comments degoogle
- Moving beyond engagement: How could Facebook's algorithms optimize for human values instead? [D] https://www.surgehq.ai/blog/what-if-social-media-optimized-for-human-values 30 comments machinelearning
- Holy $#!t: Are popular toxicity models simply profanity detectors? [D] https://www.surgehq.ai/blog/are-popular-toxicity-models-simply-profanity-detectors 84 comments machinelearning
- Holy $#!t: Are popular toxicity models simply profanity detectors? [OC] https://www.surgehq.ai/blog/are-popular-toxicity-models-simply-profanity-detectors 3 comments deeplearning
- Is Google Search Deteriorating? Measuring Google's Search Quality in 2022 https://www.surgehq.ai/blog/is-google-search-deteriorating-measuring-search-quality-in-2022 40 comments degoogle
- [D] Cohen's kappa — useful? https://www.surgehq.ai/blog/inter-rater-reliability-metrics-understanding-cohens-kappa 4 comments machinelearning
- Death threat or cat meme? Why context matters in machine learning [x-post from r/machinelearning] https://www.surgehq.ai/blog/why-context-aware-datasets-are-crucial-for-data-centric-ai 4 comments artificial