Getting my news in Obsidian

I created a simple implementation which pulls a list of RSS feeds and formats them into a nice markdown file:

Pasted image 20231130214125.png
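For anyone curious what "pull feeds, write markdown" can look like, here is a minimal sketch using only the Python standard library. It is not my exact script: the function names, the `News.md` path, and the five-item limit are placeholders, and it only handles plain RSS 2.0 (Atom feeds would need their own parsing).

```python
import urllib.request
import xml.etree.ElementTree as ET


def parse_rss(xml_data):
    """Return (feed_title, [(item_title, link), ...]) from an RSS 2.0 document."""
    channel = ET.fromstring(xml_data).find("channel")
    feed_title = channel.findtext("title", default="Untitled feed").strip()
    items = []
    for item in channel.findall("item"):
        title = item.findtext("title", default="(no title)").strip()
        link = item.findtext("link", default="").strip()
        items.append((title, link))
    return feed_title, items


def to_markdown(feed_title, items, limit=5):
    """Render one feed as a heading plus a short bullet list of links."""
    lines = [f"## {feed_title}", ""]
    for title, link in items[:limit]:
        lines.append(f"- [{title}]({link})")
    return "\n".join(lines) + "\n"


def fetch(url):
    """Download a feed as raw bytes so the XML parser can honor its encoding."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read()


def build_news_file(feed_urls, path="News.md", limit=5):
    """Fetch every feed and overwrite the news file (path is a placeholder)."""
    sections = [to_markdown(*parse_rss(fetch(u)), limit=limit) for u in feed_urls]
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(sections))
```

Calling `build_news_file([...])` with a list of feed URLs writes one section per feed. The `limit` is what keeps the file short instead of turning it into another endless scroll.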

I’ve got a couple todos to make this nicer.

  1. Work on logistics, like how frequently to update and (if it’s more frequent than daily) how to handle updating the news file.
  2. Collect more sources.
  3. Make a pipeline for marking news articles for publication. Then at the end of the day, the news articles of interest go onto my blog.
  4. Use an LLM (maybe Orca 2 13B?) to summarize and/or filter news. Summarization is straightforward, but filtering is not. What kind of criteria should I set? I think it would be easier to make it filter my Twitter feed than to filter, say, Hacker News.
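For the first todo, one possible way to handle more-than-daily updates is to append only the articles whose links are not already in the news file. This is just a sketch of that idea, not something I have settled on: `merge_items` is a hypothetical helper, and the regex assumes links are written as `[title](url)` with no parentheses inside the URL.

```python
import re

# Captures the URL part of markdown links written as [title](url).
LINK_RE = re.compile(r"\]\((\S+?)\)")


def merge_items(existing_md, new_items):
    """Append bullets for (title, link) pairs whose link is not already present."""
    seen = set(LINK_RE.findall(existing_md))
    fresh = []
    for title, link in new_items:
        if link not in seen:
            seen.add(link)
            fresh.append(f"- [{title}]({link})")
    if not fresh:
        return existing_md  # nothing new, leave the file untouched
    # Make sure the appended bullets start on their own line.
    sep = "" if existing_md.endswith("\n") or not existing_md else "\n"
    return existing_md + sep + "\n".join(fresh) + "\n"
```

Running this on every refresh is idempotent: feeding it the same items twice leaves the file unchanged, so the update frequency stops mattering much.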

I’m happy with this though. It’s a pretty clean way to render a limited amount of news. I don’t want an endlessly scrolling feed, and I don’t want links to comment sections. I just want a small set of articles that I can choose to read.

Local LLMs are catching up to GPT fast. How?

On my first day perusing my RSS feed, I noticed this article about the gap between open source LLMs and GPT. It’s nice how it breaks down the best open source models per dataset, but I suspect it’s already out of date; the new Qwen and DeepSeek ~70B models are supposed to be the new open source champs on datasets like MMLU. They may even be better than GPT-3.5. The gap is getting pretty narrow.

I want to look into these models more. There were some interesting tweets about them. One comment of interest regarded the training schedule:

The more interesting comment to me was about the dataset.

Anyways, I might try these out once the GGUF quantizations are available, but I’m not rushing. There’s always the possibility they are overbaked or otherwise not as good as the evals lead us to believe. Usually the local model community figures that out fast. If they are good, I expect we’ll see some better finetunes coming shortly.