Understand how useful metaclasses can be when working in Python.
This is the first of a three-part series by Markus Eisele. Stay tuned for the follow-up posts. AI is everywhere right now. Every conference, keynote, and internal meeting has someone showing a prototype powered by a large language model. It looks impressive. You ask a question, and the system answers in natural language. But if […]
Anthropic is starting to train its models on new Claude chats. If you’re using the bot and don’t want your chats used as training data, here’s how to opt out.
On Thursday, I published a story about the police-tech giant Flock Safety selling its drones to the private sector to track shoplifters. Keith Kauffman, a former police chief who now leads Flock’s drone efforts, described the ideal scenario: A security team at a Home Depot, say, launches a drone from the roof that follows shoplifting…
New Claude, AI shopping, 10B robots, AI saves babies, fake AI travel, and more...
Explosive growth of AI data centers is expected to increase greenhouse gas emissions. Researchers are now seeking solutions to reduce these environmental harms.
The platform appears to closely resemble TikTok and is powered by Sora 2, OpenAI's latest video generation model.
DeepSeek continues to push the frontier of generative AI... in this case, in terms of affordability.

The company has unveiled its latest experimental large language model (LLM), DeepSeek-V3.2-Exp, which mostly matches or slightly improves on the benchmarks of its predecessor, DeepSeek-V3.1-Terminus, but, more importantly, comes at a 50 percent reduced cost through DeepSeek's application programming interface (API), down to just $0.028 per million input tokens — and can keep costs down even when approaching the context limit of 128,000 tokens (about 300-400 pages' worth of information).

It's available through DeepSeek's first-party API, and the code is downloadable under an open-source, enterprise-friendly MIT License on Hugging Face and GitHub.

How did the company do it? Read on to find out.

API Costs Reduced

As previously mentioned, DeepSeek announced significant reductions in API pricing. For one million tokens, input cache hits now cost $0.028, cache misses $0.28, and outputs $0.42. This compares to $0.07, $0.56, and $1.68, respectively, under the earlier V3.1-Terminus pricing.

DeepSeek has kept Terminus temporarily available via a separate API until October 15, allowing developers to directly compare the two models, but Terminus will be deprecated after that — a short-lived model, released just one week ago.

Still, DeepSeek V3.2-Exp appears to be among the cheapest options for developers through the API, though OpenAI's GPT-5 Nano still easily takes the crown for most affordable. Take a look at it in comparison to other leading models below:

| Provider | Model (cheap/entry) | Input Price (per 1M tokens) | Output Price (per 1M tokens) | Notes / caveats |
|---|---|---|---|---|
| DeepSeek | V3.2-Exp | $0.28 / $0.028 cached input | $0.42 | |
| OpenAI | GPT-5 Nano | $0.05 / $0.005 cached input | $0.40 | |
| Google | Gemini 2.5 Flash-Lite | $0.10 | $0.40 | No cached input price available |
| Anthropic | Claude Haiku 3.5 | $0.80 / $0.08 cached input | $4.00 | |
| xAI | Grok-4 Fast Non-Reasoning | $0.20 / $0.05 cached input | $0.50 | |

New Sparse Attention Design

At the heart of V3.2-Exp is DeepSeek Sparse Attention, or DSA, described in a technical report also released by the company today on GitHub.

Traditional dense attention mechanisms calculate interactions between every token and every other token in a sequence, so compute and memory scale quadratically with sequence length. If a prompt doubles in length, the model does more than double the work to handle all those cross-token interactions, driving up GPU time and energy cost, which is reflected in the per-million-token pricing for APIs. During prefill, the amount of computation grows roughly with the square of the context length, and at least linearly during decoding. As a result, longer sequences — tens of thousands or even over 100,000 tokens — cause costs to rise much faster than the token count alone would suggest.

DSA addresses this by using a "lightning indexer" to select only the most relevant tokens for attention. This reduces the computational load while preserving nearly the same quality of responses. By reducing the compute burden per token at large context lengths, V3.2-Exp keeps the cost curve flatter and much lower.
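To make the mechanism concrete, here is a minimal sketch of top-k sparse attention in PyTorch. It illustrates the general technique only, not DeepSeek's actual DSA: for clarity this version still scores every query/key pair, whereas the point of a lightning indexer is to do the selection with a much cheaper scoring pass (causal masking is also omitted).

```python
import torch

def topk_sparse_attention(q, k, v, k_keep=64):
    # q, k, v: (n, d) tensors for a single attention head.
    # Each query attends only to its k_keep highest-scoring keys, so the
    # softmax and value mixing touch k_keep tokens instead of all n.
    scores = (q @ k.T) / (q.shape[-1] ** 0.5)          # (n, n) relevance scores
    k_keep = min(k_keep, k.shape[0])
    top_scores, top_idx = scores.topk(k_keep, dim=-1)  # (n, k_keep) selected keys
    weights = torch.softmax(top_scores, dim=-1)        # normalize over selection only
    return torch.einsum("nk,nkd->nd", weights, v[top_idx])

q = k = v = torch.randn(1024, 64)
out = topk_sparse_attention(q, k, v)  # same shape as q: (1024, 64)
```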
This makes it far more practical and affordable to run long-context workloads such as document-scale summarization, multi-turn chat with long histories, or code analysis without facing a runaway increase in inference costs.

Post-Training and Reinforcement Learning Advances

Beyond its architectural changes, DeepSeek-V3.2-Exp introduces refinements in the post-training process. The company employs a two-step approach: specialist distillation and reinforcement learning.

Specialist distillation begins with training separate models for mathematics, competitive programming, logical reasoning, agentic coding, and agentic search. These specialists, fine-tuned from the same base checkpoint, are reinforced with large-scale training to generate domain-specific data. That data is then distilled back into the final checkpoint, ensuring the consolidated model benefits from specialist knowledge while remaining general-purpose.

The reinforcement learning phase marks a significant shift. Instead of the multi-stage approach used in previous DeepSeek models, reasoning, agent, and human alignment training are merged into a single RL stage using Group Relative Policy Optimization (GRPO). This unified process balances performance across domains while avoiding the "catastrophic forgetting" issues often associated with multi-stage pipelines.

The reward design blends rule-based outcome signals, length penalties, and language consistency checks with a generative reward model guided by task-specific rubrics. Experimental results show that the distilled and reinforced model performs nearly on par with domain-specific specialists, with the gap effectively closed after RL training.

Benchmarks Steady

Benchmarking confirms the trade-off works as intended. On widely used public evaluations, V3.2-Exp performs on par with V3.1-Terminus, showing negligible differences in areas such as reasoning, coding, and question answering. While scores dipped slightly in some reasoning-heavy tasks such as GPQA-Diamond and Humanity's Last Exam, the model's efficiency gains and consistent performance elsewhere suggest the sparse approach does not substantially compromise capability.

MMLU-Pro holds steady at 85.0, AIME 2025 slightly improves to 89.3, while GPQA-Diamond dips from 80.7 to 79.9. Coding and agent benchmarks tell a similar story, with Codeforces ratings rising from 2046 to 2121 and BrowseComp improving from 38.5 to 40.1.

This balance reflects the design trade-off. By selecting only a fraction of possible tokens for attention, DSA reduces computational costs significantly. Inference cost comparisons show V3.2-Exp requires less than half the cost per million tokens of V3.1-Terminus when running on long contexts.

Open-Source Access and Deployment Options

In keeping with the company's open approach, DeepSeek has released the V3.2-Exp model weights on Hugging Face under the MIT License. Researchers and enterprises can freely download, modify, and deploy the model for commercial use.

The release is accompanied by open-source kernels: TileLang for research prototyping and CUDA/FlashMLA kernels for high-performance inference. LMSYS Org, the team behind SGLang, also announced that its framework now officially supports V3.2 with optimized sparse attention kernels, dynamic key-value caching, and scaling to 128,000 tokens. vLLM provides day-one support as well.

For local deployment, DeepSeek has provided updated demo code, along with Docker images compatible with NVIDIA H200s, AMD MI350s, and NPUs. The model, at 685 billion parameters, supports multiple tensor types including BF16, FP8, and FP32.
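For teams that want to try the hosted route first, DeepSeek's first-party API follows the OpenAI-compatible chat-completions format, so an existing client mostly just needs a new base URL. Below is a minimal sketch; the `deepseek-chat` model alias and endpoint are assumptions based on DeepSeek's public documentation, so verify them before use.

```python
# Minimal sketch of calling DeepSeek's hosted API with the OpenAI client.
# The base URL and model alias below are taken from DeepSeek's public docs;
# check current documentation before relying on them.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",
    base_url="https://api.deepseek.com",
)

response = client.chat.completions.create(
    model="deepseek-chat",  # default chat endpoint, per the rollout described above
    messages=[{"role": "user", "content": "Summarize the attached 100-page report."}],
)
print(response.choices[0].message.content)
```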
Background: DeepSeek's Iterative Push

The launch of V3.2-Exp comes just one week after DeepSeek released V3.1-Terminus, a refinement of its V3.1 model. Terminus was designed to address user feedback, improving tool-based reasoning and reducing language-mixing errors, such as inserting Chinese words into English responses.

According to reporting from VentureBeat, Terminus builds on the V3 family introduced in December 2024, which positioned DeepSeek's models as versatile, cost-efficient alternatives to its more reasoning-heavy R1 series. While R1 excels in structured logic, math, and multi-step reasoning, it is slower and more expensive. V3 models, by contrast, are built for general-purpose applications such as writing, summarization, customer-facing chat, and basic coding.

With V3.2-Exp, DeepSeek is layering in architectural innovation through sparse attention while keeping the MIT License and open-source release model intact.

Considerations for Enterprise Decision-Makers

For enterprises, especially those in the U.S., the cost savings offered by DeepSeek's API are compelling, but there are additional considerations before adoption.

- Data security and compliance: Using DeepSeek's hosted API means data flows through servers operated by a China-based company. Enterprises with sensitive customer data, regulated industries, or strict compliance frameworks (e.g., healthcare, finance, defense) will need to carefully assess legal and governance implications. Self-hosting the open-source weights may mitigate these risks, though it shifts infrastructure and maintenance responsibilities in-house.
- Performance versus control: The API offers immediate access with predictable costs and scaling. Self-hosting provides maximum control, especially over data residency and latency, but requires significant engineering resources and GPU availability. Decision-makers must weigh speed of adoption against operational overhead.
- Vendor diversification: With many U.S.-based enterprises already reliant on OpenAI, Anthropic, or Google, DeepSeek's open-source approach offers a hedge against vendor lock-in. However, integrating models from a Chinese provider may raise questions from boards or security officers.
- Total cost of ownership: While the API is cheaper per token, enterprises with steady high-volume workloads may find long-term savings by running the open-source model on their own infrastructure or through trusted third-party hosts. Given the sparse-attention architecture, even those running V3.2-Exp on their own servers and hardware should see considerably lower costs for longer token-count inputs. The choice comes down to scale, workload predictability, and appetite for internal operations.

For U.S. decision-makers evaluating DeepSeek, the calculus isn't just about API pricing; it's about aligning affordability with risk tolerance, regulatory requirements, and infrastructure strategy.

What's Next for DeepSeek?

DeepSeek-V3.2-Exp demonstrates how an open-source player can push frontier-scale models while also addressing the practical challenges of cost and deployment.
By introducing sparse attention, cutting API prices, merging reinforcement learning into a unified stage, and maintaining full transparency through Hugging Face and GitHub releases, DeepSeek is offering both a research testbed and a viable enterprise option.

The addition of frameworks like SGLang and vLLM in the official release ecosystem reinforces that DeepSeek is cultivating broad community integration rather than locking down distribution.

At the same time, the experimental nature of V3.2-Exp leaves room for iteration. Internal evaluations show promising results, but DeepSeek acknowledges it is actively testing the architecture in real-world scenarios to uncover any limitations.

Whether this experimental architecture becomes the foundation for a broader V3.3 or V4 release remains to be seen. But for now, the launch of V3.2-Exp signals DeepSeek's determination to stay visible and competitive in the global AI landscape.
Preparing Video Data for Deep Learning: Introducing Vid Prepper. A guide to fast video data preprocessing for machine learning. (Towards Data Science)
I Made My AI Model 84% Smaller and It Got Better, Not Worse. The counterintuitive approach to AI optimization that's changing how we deploy models. (Towards Data Science)
The next time you order something online, it may be through ChatGPT — at least if OpenAI and online payments provider Stripe have anything to say about it.

The two companies today announced a new feature for the world's most popular dedicated chatbot (with 700 million weekly active users globally): Instant Checkout.

The feature allows U.S. ChatGPT users in the free, Plus, and Pro subscription tiers to purchase items directly through the familiar chat interface, provided they are logged in with their accounts and usernames.

When a user asks a shopping question, such as "best running shoes under $100" or "gifts for a ceramics lover," ChatGPT will return relevant products from across the web. These results are not sponsored and are ranked on relevance alone, according to OpenAI. If a product supports Instant Checkout, the user can select "Buy" with a new button that appears in the conversational interface alongside a price.

Then, the user can manually confirm the order, shipping, and payment details, and complete the purchase — all without leaving the chat. OpenAI posted a video of the transaction interface in action on YouTube.

Payment can be processed with a card already on file for ChatGPT subscribers or through other express options. Orders, payments, and fulfillment are handled entirely by the merchant using their existing systems. ChatGPT acts as an intermediary, securely passing information between buyer and merchant.

OpenAI positions Instant Checkout as a way for ChatGPT to move beyond product discovery. In its announcement, the company says this marks the next step in agentic commerce, where ChatGPT not only helps users find what to buy but also enables them to buy it. For shoppers, the process is designed to be seamless, moving from chat to checkout in a few taps. For merchants, it offers a new way to reach ChatGPT's hundreds of millions of weekly users while keeping full control of payments, systems, and customer relationships.

What it means for merchants

Merchants pay a small fee on completed purchases. The service is free for users and does not alter product pricing or influence search rankings within ChatGPT.

OpenAI notes that when multiple merchants sell the same product, ranking considers factors such as availability, price, quality, whether the seller is primary, and whether Instant Checkout is enabled.

At launch, shoppers in the United States can buy directly from Etsy sellers within ChatGPT. More than a million Shopify merchants, including brands such as Glossier, SKIMS, Spanx, and Vuori, will be added soon. Purchases currently support only single items, but OpenAI plans to add multi-item carts, expand to more regions, and bring additional merchants onto the platform.

The Agentic Commerce Protocol

The system is powered by the Agentic Commerce Protocol (ACP), a new, open standard for AI commerce co-developed by OpenAI and Stripe. OpenAI says ACP lets AI agents, people, and businesses work together to complete purchases securely and efficiently: it defines how AI agents and businesses interact to complete transactions. Built with Stripe and with input from merchants, ACP is designed to work across platforms, processors, and business models.

Merchants do not need to overhaul their systems to adopt it. They remain the merchant of record throughout the purchase journey, including fulfillment, returns, support, and communication.
When an order is placed, ChatGPT transmits the necessary details via ACP to the merchant's backend. The merchant can then accept or decline the order, process payment with their existing provider, and complete fulfillment as usual.

For those already processing payments with Stripe, enabling agentic payments requires minimal coding — OpenAI says as little as one line. Merchants using other processors can still participate by integrating Stripe's Shared Payment Token API or adopting the Delegated Payments specification included in ACP, without switching payment providers.

OpenAI emphasizes that Instant Checkout and ACP are designed with security and control in mind. Users explicitly confirm each step before any action is taken. Payment tokens are encrypted, authorized only for specific merchants and amounts, and require user permission. Only the minimum data necessary to complete a transaction is shared with merchants.

How ACP compares with Google's AP2

OpenAI is not alone in trying to standardize how AI agents make payments. Earlier this month, Google announced the Agent Payments Protocol, or AP2, which it developed with more than 60 partners including American Express, Mastercard, PayPal, Salesforce, and ServiceNow.

Like ACP, AP2 is an open-source protocol designed to let AI agents securely complete purchases. But while ACP emphasizes keeping merchants in control using their existing processors, AP2 focuses on creating a shared rulebook across the broader digital payments ecosystem. Google's AP2 introduces the concept of "Mandates," cryptographically signed digital contracts that serve as verifiable proof of a user's instructions. These contracts provide an auditable trail that connects a user's request to the final transaction, supporting both real-time agent-assisted purchases and delegated transactions that may happen later without the user present.

While AP2 has backing from a wide range of financial institutions and payment providers, it is not yet available in consumer-facing products. ACP, by contrast, is immediately live in ChatGPT for U.S. shoppers through Etsy and soon Shopify merchants. In effect, ACP is the first to move from specification to deployment, while AP2 aims to become a broader industry standard across multiple platforms and payment networks.

Partner perspectives

Several partners highlighted what the system means for their businesses in a press release provided by OpenAI to VentureBeat. Will Gaybrick, Stripe's President of Technology and Business, says Stripe is building the economic infrastructure for AI and is proud to power Instant Checkout and co-develop ACP. Etsy's Chief Product and Technology Officer, Rafe Colburn, notes that ChatGPT helps Etsy reach buyers even when they are not actively visiting Etsy's platform. Vanessa Lee, Shopify's VP of Product, says that by bringing Shopify merchants into ChatGPT, both indie brands and established names can reach high-intent shoppers in new contexts.

What's next

The current launch is limited to single-item purchases from U.S. Etsy sellers, but OpenAI has broader ambitions. The company says Instant Checkout will expand to support multi-item carts, additional geographies, and more merchants over time. The open-sourcing of ACP is also intended to encourage wider adoption by developers and businesses beyond the initial set of partners.

OpenAI frames this release as a step toward a future where AI agents play a central role in commerce. By embedding purchasing capabilities directly into chat, the company is testing how conversational AI can connect people with businesses in the buying process itself.
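The article doesn't publish ACP's actual schema, but the merchant-side flow it describes (receive order details, accept or decline, charge via the existing processor) can be sketched roughly as follows. Every route, field name, and helper here is a hypothetical illustration, not the real specification.

```python
# Hypothetical sketch of a merchant-side handler for the ACP order flow
# described above. ACP's real endpoints and fields are not given in the
# article, so everything below is an illustrative assumption.
from flask import Flask, request, jsonify

app = Flask(__name__)

def in_stock(item_id: str) -> bool:
    return True  # stub: check your inventory system

def charge(payment_token: str, amount_cents: int) -> None:
    pass  # stub: call your existing payment processor

@app.route("/acp/orders", methods=["POST"])
def handle_acp_order():
    order = request.get_json()
    # Only the minimum data needed to complete the sale is shared.
    if not in_stock(order["item_id"]):
        # The merchant stays in control and can decline any order.
        return jsonify({"status": "declined", "reason": "out_of_stock"}), 409
    # Per the article, tokens are scoped to a specific merchant and amount.
    charge(order["payment_token"], order["amount_cents"])
    return jsonify({"status": "accepted"})
```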
We've assembled our team of Algorithmic X-Men, seven heroes mapped to seven dependable workhorses of machine learning.
Usually shrouded in mystery at first glance, Python decorators are, at their core, functions wrapped around other functions to provide extra functionality without altering the key logic in the function being "decorated".
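As a quick illustration of that idea (a generic example, not code from the post), here is a decorator that adds logging around a function without touching its body:

```python
import functools

def log_calls(func):
    """Wrap func so each call is logged, leaving its logic untouched."""
    @functools.wraps(func)  # preserve the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        print(f"calling {func.__name__}")
        return func(*args, **kwargs)
    return wrapper

@log_calls
def add(a, b):
    return a + b

add(2, 3)  # prints "calling add", then returns 5
```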
MCP in Practice. Mapping power, concentration, and usage in the emerging AI developer ecosystem. (Towards Data Science)
Want to learn Python for data science? Start today with this beginner-friendly mini-course packed with bite-sized lessons and hands-on examples.
When I was eight years old, I watched a mountaineering documentary while waiting for the cricket match to start. I remember being incredibly frustrated watching these climbers inch their way up a massive rock face, stopping every few feet to hammer what looked like giant nails into the mountain. “Why don’t they just climb faster?” […]
After seven rocky years, the company’s assets will be sold to Dazzle, a new AI firm that Mayer founded.
Ten years ago, we introduced Google’s signature four-color G to match the new look and feel of our logo. The design update reflected all the ways people interacted with …
ChatGPT ads, Adobe struggles, DIY local AI, vibe coding tips, LLM course, and more...
Eulerian Melodies: Graph Algorithms for Music Composition. Conceptual overview and an end-to-end Python implementation. (Towards Data Science)
Diraq has shown that its silicon-based quantum chips can maintain world-class accuracy even when mass-produced in semiconductor foundries. Achieving over 99% fidelity in two-qubit operations, the breakthrough clears a major hurdle toward utility-scale quantum computing. Silicon’s compatibility with existing chipmaking processes means building powerful quantum processors could become both cost-effective and scalable.
Consistent AI videos, full AI course, ChatGPT fail, language shortcut, and more...
Learning Triton One Kernel At a Time: Vector Addition. The basics of GPU programming, optimisation, and your first Triton kernel. (Towards Data Science)
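For a taste of what that first kernel looks like, here is the canonical vector-addition pattern from Triton's public tutorials (not necessarily the post's exact code):

```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one BLOCK_SIZE-wide chunk of the vectors.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements  # guard the final, partially filled block
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

x = torch.rand(98432, device="cuda")
y = torch.rand(98432, device="cuda")
out = torch.empty_like(x)
grid = lambda meta: (triton.cdiv(x.numel(), meta["BLOCK_SIZE"]),)
add_kernel[grid](x, y, out, x.numel(), BLOCK_SIZE=1024)
assert torch.allclose(out, x + y)
```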
What Clients Really Ask for in AI Projects. Managing AI projects is no walk in the park, but you have the power to make it easier for everyone. (Towards Data Science)
AI survival, Perplexity vs Google, AI threat, Gemini makeover, and more...
Generative AI has enabled the production of child sexual abuse images to skyrocket. Now the leading investigator of child exploitation in the US is experimenting with using AI to distinguish AI-generated images from material depicting real victims, according to a new government filing. The Department of Homeland Security’s Cyber Crimes Center, which investigates child exploitation…
On this special episode of Uncanny Valley recorded in front of a live audience in San Francisco, our hosts discuss Silicon Valley's history and future.
Building Fact-Checking Systems: Catching Repeating False Claims Before They Spread. How retrieval and ensemble methods make fact-checking faster, scalable, and more reliable in a digital world. (Towards Data Science)
In this solution, we demonstrate how a user (a parent) can interact with a Strands or LangGraph agent in a conversational style to get information about their child's immunization history and schedule, inquire about available slots, and book appointments. With some changes, the agents can be made event-driven so that they automatically send reminders, book appointments, and so on.
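As a rough sketch of the conversational pattern described above, the snippet below wires two hypothetical scheduling tools into a prebuilt LangGraph ReAct agent. The tool logic, model string, and slot data are illustrative stand-ins, not the solution's actual implementation.

```python
# Minimal sketch of a conversational booking agent using LangGraph's prebuilt
# ReAct agent. All tool bodies and the model identifier are hypothetical.
from langgraph.prebuilt import create_react_agent
from langchain_core.tools import tool

@tool
def get_available_slots(date: str) -> list[str]:
    """Return open immunization appointment slots for the given date."""
    return ["09:00", "11:30", "14:00"]  # stub: query the clinic's scheduler

@tool
def book_appointment(slot: str, child_id: str) -> str:
    """Book the given slot for the child and return a confirmation ID."""
    return f"CONF-{child_id}-{slot}"  # stub: write to the booking system

# Recent langgraph versions accept a "provider:model" string; older ones
# require passing a chat-model object instead.
agent = create_react_agent(
    "anthropic:claude-3-5-sonnet-latest",
    tools=[get_available_slots, book_appointment],
)

result = agent.invoke(
    {"messages": [("user", "What slots are open tomorrow for my child's MMR shot?")]}
)
print(result["messages"][-1].content)
```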