Remarkable 9
This week, among other topics, we explore why inflation is not yet beaten, why white-collar work and software stocks are under threat, and how DeepSeek strikes again. Meanwhile, while we weren’t looking, Chinese universities have powered up the rankings.
The latest “Beige Book” from the Federal Reserve - which tracks economic conditions across much of the country - found that businesses are starting to pass tariff costs on to consumers, now that their pre-tariff inventory has been cleaned out. “Cost pressures due to tariffs were a consistent theme across all Districts,” the Federal Reserve wrote in the January report. “Several contacts that initially absorbed tariff-related costs were beginning to pass them on to customers as pre-tariff inventories became depleted or as pressures to preserve margins grew more acute.”
Trump’s tariffs are now being passed on to American consumers, businesses warn
Look back to the early 2000s, and a global university ranking based on scientific output, such as published journal articles, would be very different. Seven American schools would be among the top 10, led by Harvard University at No. 1. Only one Chinese school, Zhejiang University, would even make the top 25. Today, Zhejiang is ranked first on that list, the Leiden Rankings, from the Centre for Science and Technology Studies at Leiden University in the Netherlands. Seven other Chinese schools are in the top 10.
Harvard Slips on a Global Ranking List, as Chinese Schools Surge Ahead
Antiviral drugs for influenza, the best known of which is Tamiflu, are—let’s be honest—not exactly miracle cures. They marginally shorten the course of illness, especially if taken within the first 48 hours. But amid possibly the worst flu season in 25 years, driven by a variant imperfectly matched to the vaccine, these underused drugs can make a bout of flu a little less miserable. So consider an antiviral. And specifically, consider Xofluza, a lesser-known drug that is in fact better than Tamiflu.
The Best Flu Drug Americans Aren’t Taking
Everybody is raving about Gemini 3, but it just got beaten by a familiar foe…
After nine prompts, DeepSeek proved to be the better tool when accuracy and structure matter most. It consistently delivered clean answers, respected constraints and avoided unnecessary verbosity — making it ideal for technical work, structured analysis, instruction-heavy tasks and moments where clarity matters most. Gemini 3 Flash did well, too, performing best when the task benefits from interpretation, explanation or creativity. I have to admit that I’m completely surprised by the results here. DeepSeek has been controversial in the past, but it stands out in many new ways that clearly make it competitive. DeepSeek might just be the chatbot to watch this year.
I tested Gemini 3 Flash vs. DeepSeek with 9 prompts — the winner surprised me | Tom’s Guide
And it might have other tricks up its sleeve:
When an enterprise LLM retrieves a product name, technical specification, or standard contract clause, it’s using expensive GPU computation designed for complex reasoning — just to access static information. This happens millions of times per day. Each lookup wastes cycles and inflates infrastructure costs. DeepSeek’s newly released research on “conditional memory” addresses this architectural limitation directly. The work introduces Engram, a module that separates static pattern retrieval from dynamic reasoning. It delivers results that challenge assumptions about what memory is actually for in neural networks. The paper was co-authored by DeepSeek founder Liang Wenfeng. Through systematic experiments, DeepSeek found the optimal balance between computation and memory, with 75% of sparse model capacity allocated to dynamic reasoning and 25% to static lookups. This memory system improved reasoning more than knowledge retrieval: complex reasoning benchmarks jumped from 70% to 74% accuracy, while knowledge-focused tests improved from 57% to 61%. These improvements came from tests including Big-Bench Hard, ARC-Challenge, and MMLU.
DeepSeek has released a new technical paper detailing a method by which new AI models might rely on a queryable database of information committed to system memory. Named “Engram”, the conditional memory-based technique achieves demonstrably higher performance on long-context queries by committing sequences of data to static memory. This eases the reasoning load on AI models, allowing GPUs to handle only the more complex tasks, which increases performance and reduces reliance on high-bandwidth memory (HBM).
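Neither piece shows code, but the core idea — route known static patterns to a cheap lookup and reserve expensive computation for novel inputs — can be sketched. Everything below (the class, method names, and routing logic) is an illustrative assumption, not DeepSeek’s actual Engram implementation:

```python
# Illustrative sketch of conditional memory, NOT DeepSeek's real Engram module:
# exact patterns seen before are served from a frozen lookup table; only
# novel inputs fall through to the expensive "dynamic reasoning" path.

class ConditionalMemory:
    def __init__(self, compute_fn):
        self.static = {}            # frozen pattern -> cached result
        self.compute_fn = compute_fn
        self.lookups = 0            # queries served from static memory
        self.computes = 0           # queries sent to the expensive path

    def commit(self, pattern, result):
        """Freeze a known pattern (e.g. a product spec) into static memory."""
        self.static[pattern] = result

    def query(self, pattern):
        if pattern in self.static:  # cheap static retrieval
            self.lookups += 1
            return self.static[pattern]
        self.computes += 1          # expensive dynamic reasoning path
        return self.compute_fn(pattern)


def expensive_reasoning(text):
    # Stand-in for GPU-bound model computation.
    return text.upper()


mem = ConditionalMemory(expensive_reasoning)
mem.commit("sku-1042 voltage", "12V DC")
print(mem.query("sku-1042 voltage"))   # served from static memory -> 12V DC
print(mem.query("novel question"))     # falls back to compute -> NOVEL QUESTION
print(mem.lookups, mem.computes)       # 1 1
```

The 75/25 split the paper reports would correspond, in this toy framing, to how much model capacity is budgeted for `compute_fn` versus the static table.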
But LLMs still hallucinate, although some more than others:
In the arms race to be the most helpful assistant, the surprising winner is the one that knows when to say “I don’t know.” I test AI chatbots daily — using them to write code, summarize long meetings and explain the nuances of quantum physics. But the biggest risk with Large Language Models (LLMs) isn’t what they don’t know; it’s what they pretend to know. All too often chatbots confidently give the wrong answer — and users may not even notice. To see how today’s top models handle a blatant falsehood, I gave them a nonsense test. I invented an idiom that doesn’t exist and asked ChatGPT, Gemini and Claude to define it.
I invented a fake idiom to test AI chatbots — only one called my bluff | Tom’s Guide
The next era in AI is arriving and becoming practical:
AI assistants move from hype to habit as Google, Anthropic, Salesforce, and LG roll out tools that automate work, healthcare, and home chores.
Daily Tech Insider Unpacks AI Assistants’ Leap From Code to Chores
And it might have disturbing consequences for many software stocks and white-collar work:
Software was never the endgame. It was the easiest place to prove the point, because it has cheap verification. Tests pass or they do not. The build is green or red. A deploy works or it pages you. That makes it a perfect environment for agents to iterate, self-correct, and get trained against objective feedback. But most white-collar work is also full of scoreboards. It just hides behind nice titles. Accounting closes or it does not. Claims are approved or denied. Contracts pass compliance review or they do not. A spreadsheet ties out or it does not. A quarterly report reconciles with source systems or it does not. A huge fraction of “knowledge work” is language plus rules plus checklists, sitting inside software systems that already define correctness.

The White-Collar Bloodbath Starts Quietly, Then It Cascades

The “decimation” does not start with a dramatic moment where everyone gets fired on the same day. It starts with hiring slowing down. Intern classes shrinking. Junior roles vanishing. Teams realizing they can ship the same roadmap with fewer people because agents eat the glue work, the first drafts, the basic implementation, the documentation pass, the test scaffolding, the refactor churn.

