As developers increasingly lean on AI-generated code to build out their software—as they have with open source in the past—they risk introducing critical security failures along the way.
When I write about the cognitive migration now underway, brought about by the rapid advance of gen AI, I do so from the perspective of someone who has spent four decades in the technology industry. My own journey runs from coding business applications in Fortran and COBOL to systems analysis and design, IT project management, enterprise systems consulting, computing hardware sales and technology industry communications. All of it has been centered in the U.S., although I have collaborated with colleagues and clients across Europe and Asia. My writing carries an American, tech-industry vantage point, although I make attempts to see a broader perspective. Perhaps that is fitting, since much of the frontier development of AI remains clustered in Silicon Valley, Seattle, Boston and a handful of other Western hubs. But how does this migration look beyond America’s borders? For millions in the Global South, cognitive migration is less about the loss of white-collar prestige and more about the chance to leapfrog into new opportunities.

This divide is visible in the data. The 2025 Edelman Trust Barometer found that fewer than one in three Americans feel comfortable with businesses using AI, while in India, Indonesia and Nigeria nearly two-thirds express comfort. In the West, AI may be perceived to threaten job loss and displacement, and this view may be warranted. A study by the International Monetary Fund (IMF) found that 60% of jobs in advanced economies are exposed to the impact of AI due to the prevalence of cognitive-task-oriented jobs. The Wall Street Journal quoted Ford CEO Jim Farley: “AI will leave a lot of white-collar people behind.”

In the Global South, however, AI is often perceived as an opportunity to improve education, strengthen healthcare, modernize agriculture and drive development. One analysis argues that for the Global South, “AI holds tangible promise for nations historically excluded from the benefits of previous industrial revolutions.” Perhaps this explains the findings reported by Academia.edu that Global North newspapers publish more negative AI headlines, while Global South outlets emphasize opportunity. Yet the story is not so simple. Even where the potential for advancement is emphasized, there is often also worry about loss of work, ethics, algorithmic bias, access and technical capacity. As with earlier waves of globalization, gains and risks will be distributed unevenly.

AI as opportunity

There is a strong positive narrative around AI in the Global South, with many hopeful stories and promising results. In Nigeria, a World Bank-funded after-school tutoring program that used AI to tailor lessons to individual students produced striking results, with nearly two years of learning gains in just six weeks. For communities with few qualified teachers, such gains are not incremental improvements. They can transform futures.

Healthcare applications provide comparable stories. In India, Boston Consulting Group reports that AI diagnostic tools are being deployed in rural clinics with few doctors, offering screenings for conditions such as breast cancer or tuberculosis that might otherwise go undetected. These tools extend the reach of limited health resources and help detect conditions before it is too late.

The use of AI in agriculture also shows promise. In Kenya, the PlantVillage Nuru app, developed with Penn State University, uses AI to detect crop diseases through farmers’ smartphones, equipping them to spot and treat threats to their harvests early.
For households that depend on subsistence farming, such tools can mean the difference between security and scarcity.

Yet many of these breakthroughs rely on Northern institutions, creating benefits but also exposing a fragile dependency. When outside funding or partnerships end, local efforts can stall. In this sense, leapfrogging risks being built on borrowed foundations.

Taken together, these examples illustrate why many in the Global South see AI as a chance to transform trajectories rather than repeat old patterns. Yet optimism tells only part of the story. Alongside these gains are deep structural challenges that complicate the journey, reminding us that this migration, like all others, carries benefits that include hidden costs.

Barriers to progress

Research also shows that AI adoption across the Global South is hindered by persistent gaps in infrastructure, data, skills and governance. Availability of reliable electricity and broadband remains uneven, local datasets are often scarce or biased, and many countries face shortages of trained professionals to develop and oversee AI systems. Without strong regulatory frameworks, societies are also more exposed to privacy risks, exploitative labor practices and algorithmic bias. These realities mean that while AI holds promise as a development pathway, it can also deepen inequality if its benefits concentrate in urban centers and among elites while leaving rural communities behind.

So why do surveys of trust show higher comfort with AI in the Global South than in the West? One explanation lies in expectations. In the U.S. and Europe, AI is often perceived as a threat to stable jobs and established professions. In Nigeria, India or Indonesia, by contrast, it is more likely to be framed as a tool for closing persistent gaps. Media narratives often reinforce the divergence in expectations. In the West, headlines emphasize automation anxiety, while in the Global South, AI is more often described as a development pathway. Add to this the fact that many people in the Global South report higher levels of trust in institutions overall, and the disparity begins to make sense. The same technology intersects with different baselines, diverse needs, distinct cultures and different stories, which shape whether AI is met with suspicion or with hope. Yet beyond these perceptual differences lie material realities that complicate the optimistic narrative, particularly in how global AI development distributes both its benefits and its burdens.

Hidden costs

Every migration carries costs alongside gains, and the story of AI in the Global South is no different. While the overall AI narrative in the Global South leans positive, many celebrated breakthroughs depend on large workforces doing essential yet hidden tasks. Data annotation and content review are indispensable to the global AI economy, but the work is repetitive, emotionally taxing and poorly paid relative to the value it creates.

Other sectors face pressure from a different direction. In India and the Philippines, business process outsourcing and call centers employ millions of workers who support global clients. These roles depend on language, routine cognitive tasks and customer service, the very areas where AI chatbots and automated platforms are advancing fastest. The shift is not immediate, but workers in these industries are already questioning whether the migration now underway will carry them forward or leave them behind.
Is cognitive migration a single global phenomenon, or are we witnessing multiple migrations that only appear connected?

Many routes, shared destination

Is this the same cognitive migration unfolding everywhere, or are there separate journeys? On the surface, the story looks divided. In the U.S. and Europe, professionals worry about displacement from stable careers and a risk to their lifestyles. In India, Nigeria and Indonesia, AI is often presented as a chance to accelerate development and fill long-standing gaps. These appear to be distinct migrations.

Yet the reality is more entangled. The story of AI in the Global South is not simply one of catching up, just as the story in the West is not simply one of decline. Migration is never only progress or only loss. It is both, with something gained and something given up. For teachers in Nigeria, the gain may be students advancing at unprecedented speed. For call center workers in India, the loss may be jobs once thought secure. For farmers in Kenya, the gain may be healthier crops and steadier harvests. For professionals in Europe or the United States, the loss may be careers reshaped or diminished by automation.

This variability in experience arises not because the AI technology differs from one region to another, but because lived experiences do. The same systems can seem empowering in one place and threatening in another.

An uneven passage

What lies ahead is still uncertain. But if migration teaches anything, it is that adaptation requires not only resilience but imagination. The task is not to deny what is lost or to celebrate only what is gained, but to recognize both and design wisely for what comes next.

This migration is not unfolding along a single path. It is fractured and revealing. The starting points differ, the routes are uneven, and the burdens are not equally shared. In the Global South, AI is often seen as a lever for progress, not a threat to status. But beneath the promise lie the same risks we face everywhere, including extraction without investment, automation without inclusion, innovation without safeguards and deployment without trust. These are not side effects. They are signals. If we ignore them, the cognitive future will be one more story written by the few for the few.

As Indonesian policy advisor Tuhu Nugraha has argued in Modern Diplomacy: “As concerns rise globally about AI’s unchecked development potentially destabilizing economies or social cohesion, models from the Global South that emphasize inclusion, trust and reflection can help mitigate those risks before they explode into global backlash.” His warning reinforces the point that inclusion and trust must be designed into AI advancement, not assumed.

If we pay attention, the Global South may offer not just caution but clarity. The choice is not only whether to design wisely, but whose experience we treat as essential when we do. Because in the end, cognitive migration is not regional. It is a worldwide passage, and how we navigate it together will shape not just the future of AI, but the future of being human.

Gary Grossman is EVP of technology practice at Edelman.
Wearable brain, Claude defends, Hollywood fooled, no job crisis, vibe coding, and more...
Why you shouldn't overcomplicate solutions to simple problems
The post Classical Computer Vision and Perspective Transformation for Sudoku Extraction appeared first on Towards Data Science.
Practice control flow, input handling, and functions in R by creating an interactive quiz game.
The post Building a Command-Line Quiz Application in R appeared first on Towards Data Science.
In the race to automate everything – from customer service to code – AI is being heralded as a silver bullet. The narrative is seductive: AI tools that can write entire applications, streamline engineering teams and reduce the need for expensive human developers, along with hundreds of other jobs. But from my point of view as a technologist who spends every day inside real companies’ data and workflows, the hype doesn’t match up with the reality.

I’ve worked with industry leaders like General Electric, The Walt Disney Company and Harvard Medical School to optimize their data and AI infrastructure, and here’s what I’ve learned: Replacing humans with AI in most jobs is still just an idea on the horizon. I worry that we're thinking too far ahead. In the past two years, more than a quarter of programming jobs have vanished. Mark Zuckerberg announced he is planning to replace many of Meta’s coders with AI. But, intriguingly, both Bill Gates and Sam Altman have publicly warned against replacing coders.

Right now, we shouldn’t count on AI tools to successfully replace jobs in tech or business. That’s because what AI knows is inherently limited by what it has seen – and most of what it has seen in the tech world is boilerplate.

Generative AI models are trained on large datasets, which typically fall into two main categories: publicly available data (from the open internet), or proprietary or licensed data (created in-house by the organization, or purchased from third parties). Simple tasks, like building a basic website or configuring a template app, are easy wins for generative models. But when it comes to writing the sophisticated, proprietary infrastructure code that powers companies like Google or Stripe, there’s a problem: That code doesn’t exist in public repositories. It’s locked away inside the walls of corporations, inaccessible to training data and often written by engineers with decades of experience.

Right now, AI can’t yet reason on its own, and it doesn’t have instincts. It’s just mimicking patterns. A friend of mine in the tech world once described large language models (LLMs) as a "really good guesser." Think of AI today as a junior team member — helpful for a first draft or simple projects. But like any junior, it requires oversight. In programming, for example, while I’ve found a 5X improvement for simple coding, I’ve found that reviewing and correcting more complicated AI-produced code often takes more time and energy than writing the code myself. You still need senior professionals with deep experience to find the flaws, and to understand the nuances of how those flaws might pose a risk six months from now.

That’s not to say AI shouldn’t have a place in the workplace. But the dream of replacing entire teams of programmers or accountants or marketers with one human and a host of AI tools is far premature. We still need senior-level people in these jobs, and we need to train people in junior-level jobs to be technically capable enough to assume the more complex roles one day. The goal of AI in tech and business shouldn’t be about removing humans from the loop.

I’m not saying this because I’m scared AI will take my job. I’m saying it because I’ve seen how dangerous trusting AI too much at this stage can be. Business leaders, no matter what industry they’re in, should be aware: While AI promises cost savings and smaller teams, these efficiency gains could backfire. You might trust AI to perform more junior levels of work, but not to complete more sophisticated projects.

AI is fast. Humans are smart. There’s a big difference. The sooner we shift the conversation from replacing humans to reinforcing them, the more we’ll reap the benefits of AI.

Derek Chang is founding partner of Stratus Data.
Gemini redesign, Sora rules, OpenAI fintech, fake voices, DNA breach, and more...
A cycle-accurate alternative to speculation — unifying scalar, vector and matrix compute

For more than half a century, computing has relied on the Von Neumann or Harvard model. Nearly every modern chip — CPUs, GPUs and even many specialized accelerators — derives from this design. Over time, new architectures like Very Long Instruction Word (VLIW), dataflow processors and GPUs were introduced to address specific performance bottlenecks, but none offered a comprehensive alternative to the paradigm itself.
A new approach called Deterministic Execution challenges this status quo. Instead of dynamically guessing what instructions to run next, it schedules every operation with cycle-level precision, creating a predictable execution timeline. This enables a single processor to unify scalar, vector and matrix compute — handling both general-purpose and AI-intensive workloads without relying on separate accelerators.

The end of guesswork

In dynamic execution, processors speculate about future instructions, dispatch work out of order and roll back when predictions are wrong. This adds complexity, wastes power and can expose security vulnerabilities. Deterministic Execution eliminates speculation entirely. Each instruction has a fixed time slot and resource allocation, ensuring it is issued at exactly the right cycle.
The mechanism behind this is a time-resource matrix: a scheduling framework that orchestrates compute, memory and control resources across time. Much like a train timetable, scalar, vector and matrix operations move across a synchronized compute fabric without pipeline stalls or contention.

Why it matters for enterprise AI
Enterprise AI workloads are pushing existing architectures to their limits. GPUs deliver massive throughput but consume enormous power and struggle with memory bottlenecks. CPUs offer flexibility but lack the parallelism needed for modern inference and training. Multi-chip solutions often introduce latency, synchronization issues and software fragmentation.
In large AI workloads, datasets often cannot fit into caches, and the processor must pull them directly from DRAM or HBM. Accesses can take hundreds of cycles, leaving functional units idle and burning energy. Traditional pipelines stall on every dependency, magnifying the performance gap between theoretical and delivered throughput.
Deterministic Execution addresses these challenges in three important ways. First, it provides a unified architecture in which general-purpose processing and AI acceleration coexist on a single chip, eliminating the overhead of switching between units. Second, it delivers predictable performance through cycle-accurate execution, making it ideal for latency-sensitive applications such as large language model (LLM) inference, fraud detection and industrial automation. Finally, it reduces power consumption and physical footprint by simplifying control logic, which in turn translates to a smaller die area and lower energy use.
By predicting exactly when data will arrive — whether in 10 cycles or 200 — Deterministic Execution can slot dependent instructions into the right future cycle. This turns latency from a hazard into a schedulable event, keeping the execution units fully utilized and avoiding the massive thread and buffer overheads used by GPUs or custom VLIW chips. In modeled workloads, this unified design delivers sustained throughput on par with accelerator-class hardware while running general-purpose code, enabling a single processor to fulfill roles typically split between a CPU and a GPU.
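The train-timetable idea can be illustrated with a toy software model. The following Python sketch is a simplified, hypothetical illustration of statically slotting operations into fixed cycles based on known latencies; it is not the architecture's actual scheduler, and the instruction names, latencies and single-issue assumption are invented for the example.

```python
# Toy model of a time-resource matrix: every operation is assigned a fixed
# issue cycle ahead of time, so dependents launch exactly when their data arrives.
# Hypothetical illustration only; not the real hardware scheduler.
from dataclasses import dataclass, field

@dataclass
class Instr:
    name: str
    latency: int                              # cycles until the result is ready
    deps: list = field(default_factory=list)  # names of producer instructions

def schedule(program):
    """Statically assign one issue cycle per instruction (no speculation, no stalls)."""
    ready_at = {}   # instruction name -> cycle its result becomes available
    timetable = {}  # issue cycle -> instruction name (one issue slot per cycle here)
    for instr in program:
        issue = max((ready_at[d] for d in instr.deps), default=0)
        while issue in timetable:   # find the next free slot in this toy single-issue model
            issue += 1
        timetable[issue] = instr.name
        ready_at[instr.name] = issue + instr.latency
    return timetable

# A long, known-latency DRAM load and a short cache hit feed a dependent multiply-add.
program = [
    Instr("load_a", latency=200),   # e.g., a DRAM/HBM access
    Instr("load_b", latency=10),    # e.g., a cache or buffer hit
    Instr("fma", latency=4, deps=["load_a", "load_b"]),
]
print(schedule(program))  # {0: 'load_a', 1: 'load_b', 200: 'fma'}
```

The point of the toy is that the long load's latency is known in advance, so the dependent operation is placed 200 cycles out instead of stalling the pipeline; latency becomes a schedulable event rather than a hazard, as the article puts it.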
For LLM deployment teams, this means inference servers can be tuned with precise performance guarantees. For data infrastructure managers, it offers a single compute target that scales from edge devices to cloud racks without major software rewrites.

[Figure: Comparison of traditional Von Neumann architecture and unified deterministic execution. Image created by author.]

Key architectural innovations
Deterministic Execution builds on several enabling techniques. The time-resource matrix orchestrates compute and memory resources in fixed time slots. Phantom registers allow pipelining beyond the limits of the physical register file. Vector data buffers and extended vector register sets make it possible to scale parallel processing for AI operations. Instruction replay buffers manage variable-latency events predictably, without relying on speculation.
The architecture’s dual-banked register file doubles read/write capacity without the penalty of more ports. Direct queuing from DRAM into the vector load/store buffer halves memory accesses and removes the need for multi-megabyte SRAM buffers — cutting silicon area, cost and power.
In modeled AI and DSP kernels, conventional designs issue a load, wait for it to return, then proceed — causing the entire pipeline to idle. Deterministic Execution pipelines loads and dependent computations in parallel, allowing the same loop to run without interruption, cutting both execution time and joules per operation.
Together, these innovations create a compute engine that combines the flexibility of a CPU with the sustained throughput of an accelerator, without requiring two separate chips.

Implications beyond AI
While AI workloads are an obvious beneficiary, Deterministic Execution has broad implications for other domains. Safety-critical systems — such as those in automotive, aerospace and medical devices — can benefit from deterministic timing guarantees. Real-time analytics systems in finance and operations gain the ability to operate without jitter. Edge computing platforms, where every watt of power matters, can operate more efficiently.
By eliminating guesswork and enforcing predictable timing, systems built on this approach become easier to verify, more secure and more energy-efficient.

Enterprise impact
For enterprises deploying AI at scale, architectural efficiency translates directly into competitive advantage. Predictable, latency-free execution simplifies capacity planning for LLM inference clusters, ensuring consistent response times even under peak loads. Lower power consumption and reduced silicon footprint cut operational expenses, especially in large data centers where cooling and energy costs dominate budgets. In edge environments, the ability to run diverse workloads on one chip reduces hardware SKUs, shortens deployment timelines and minimizes maintenance complexity.

A path forward for enterprise computing
The shift to Deterministic Execution is not merely about raw performance; it represents a return to architectural simplicity, where one chip can serve multiple roles without compromise. As AI permeates every sector, from manufacturing to cybersecurity, the ability to run diverse workloads predictably on a single architecture will be a strategic advantage.
Enterprises evaluating infrastructure for the next five to 10 years should watch this development closely. Deterministic Execution has the potential to reduce hardware complexity, cut power costs and simplify software deployment — while enabling consistent performance across a wide range of applications.
Thang Minh Tran is a microprocessor architect and inventor of more than 180 patents in CPU and accelerator design.
Once upon a time, handling streaming data was considered an avant-garde approach. Since the introduction of relational database management systems in the 1970s and traditional data warehousing systems in the late 1980s, all data workloads began and ended with so-called batch processing. Batch processing relies on the concept of collecting numerous tasks in a group (or batch) […]
The post Real-Time Intelligence in Microsoft Fabric: The Ultimate Guide appeared first on Towards Data Science.
Learn how to access vast amounts of information with your own deep research system
The post How to Build a Powerful Deep Research System appeared first on Towards Data Science.
HydroSpread, a breakthrough fabrication method, lets scientists build ultrathin soft robots directly on water. These tiny, insect-inspired machines could transform robotics, healthcare, and environmental monitoring.
Brain AI, Sora is #1, free Comet, creepy robots, auth for AI agents, and more...
Organizations are increasingly integrating generative AI capabilities into their applications to enhance customer experiences, streamline operations, and drive innovation. As generative AI workloads continue to grow in scale and importance, organizations face new challenges in maintaining consistent performance, reliability, and availability of their AI-powered applications. Customers are looking to scale their AI inference workloads across […]
Huawei’s Computing Systems Lab in Zurich has introduced a new open-source quantization method for large language models (LLMs) aimed at reducing memory demands without sacrificing output quality. The technique, called SINQ (Sinkhorn-Normalized Quantization), is designed to be fast, calibration-free, and easy to integrate into existing model workflows. The code for performing it has been made available by the Huawei research team on GitHub and Hugging Face under a permissive, enterprise-friendly Apache 2.0 license, allowing organizations to take and use it, modify it, and deploy it commercially — all for free.

Across models of different sizes, SINQ cuts memory usage by 60–70%, depending on architecture and bit-width. This enables models that would previously require >60 GB of memory to run on ~20 GB setups — a critical enabler for running large models on a single high-end GPU or even multi-GPU consumer-grade setups.

This makes it possible to run models that previously needed high-end enterprise GPUs — like NVIDIA’s A100 or H100 — on significantly more affordable hardware, such as a single Nvidia GeForce RTX 4090 (around $1,600), instead of enterprise hardware like the A100 80GB ($19,000) or even H100 units that exceed $30,000.

For teams using cloud infrastructure, the savings are similarly tangible. A100-based instances often cost $3–4.50 per hour, while 24 GB GPUs like the RTX 4090 are available on many platforms for $1–1.50 per hour. Over time, especially for extended inference workloads, this difference can add up to thousands of dollars in cost reductions, while also unlocking LLM deployment on smaller clusters, local workstations, or consumer-grade setups previously constrained by memory.

Tackling the Memory Challenge of LLMs

Running large models often requires compromises between performance and size. In practice, neural networks use floating-point numbers to represent both weights and activations. A floating-point number can express a wide range of values (very small, very large, with fractional parts). This flexibility is helpful because during training and inference, weights and activations can vary in scale dramatically. Using floating-point lets the model adjust precisely. (For example, a weight could be 0.0023 or 123.45, and floating-point can capture both with decent precision.)

Quantization — a method that reduces the precision of model weights — offers a practical path to lower memory usage, but typically comes with trade-offs in model quality, especially at 4-bit precision and below. When you convert those floating-point values into lower-precision formats (like 8-bit integers), you’re approximating them. That means you store and compute with fewer bits, which is faster and more memory-efficient — but you risk losing fidelity (i.e., introducing small errors). The trick is to do the conversion carefully so the model’s behavior stays nearly the same, even though internally it’s working with rougher approximations of those weights and activations.

SINQ addresses these pain points by introducing a plug-and-play solution that delivers strong performance even in low-precision settings — without requiring calibration data or inter-layer dependencies.

How SINQ Works

The SINQ approach introduces two main innovations:
Dual-Axis Scaling: Instead of using a single scale factor for quantizing a matrix, SINQ uses separate scaling vectors for rows and columns. This helps mitigate the effects of outliers and allows the quantization error to be distributed more flexibly across the matrix.

Sinkhorn-Knopp-Style Normalization: A fast algorithm inspired by Sinkhorn iterations is used to normalize the standard deviations of rows and columns in a matrix. This helps minimize what the authors call “matrix imbalance,” a new proxy metric shown to be more effective than alternatives like kurtosis for improving quantization performance.

The combination of these two features allows SINQ to outperform other calibration-free techniques such as Round-To-Nearest (RTN), HQQ, and Hadamard-based quantization across multiple benchmarks. (A rough illustrative sketch of the dual-axis idea appears at the end of this article.)

Performance and Compatibility

SINQ has been evaluated across a wide range of architectures and models, including the Qwen3 series, LLaMA, and DeepSeek. On benchmarks like WikiText2 and C4, SINQ consistently reduces perplexity and flip rates compared to baseline methods, often approaching or matching the performance of calibrated solutions. It also supports non-uniform quantization schemes such as NF4 and can be combined with calibration methods like AWQ, leading to the variant A-SINQ. In calibrated settings, A-SINQ further narrows the gap with full-precision models.

In terms of runtime efficiency, SINQ quantizes models roughly twice as fast as HQQ and over 30 times faster than AWQ. This makes it well-suited for both research and production environments where quantization time is a practical constraint.

Open Source and Easy to Use

Huawei has released SINQ as an open-source project under a permissive, enterprise-friendly Apache 2.0 license, with implementation instructions and reproducibility tools available on GitHub. The repository includes support for quantizing Hugging Face models with just a few lines of code, as well as tools for saving and reloading quantized weights. Default settings offer a balance between memory savings and accuracy, and users can customize parameters like bit-width, tiling strategy, and group size based on their needs. The authors also provide evaluation integration via the lm-eval library and plan to release pre-quantized models on the Hugging Face Hub in the near future.

Looking Ahead

With growing demand for running large models on consumer-grade hardware, quantization is becoming an essential tool. SINQ aims to lower the entry barrier for LLM deployment, enabling developers and researchers to efficiently shrink models without major trade-offs in quality or compatibility. Further updates — including integration with Hugging Face Transformers and pre-quantized model releases — are planned, making this a project to watch in the quantization space.
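As a rough illustration of the dual-axis scaling and Sinkhorn-style balancing described above, here is a minimal, hypothetical NumPy sketch. It is not the released SINQ code; the iteration count, rounding scheme and function names are assumptions made for the example.

```python
import numpy as np

def dual_axis_quantize(W, bits=4, iters=10, eps=1e-8):
    """Hypothetical sketch: balance a weight matrix with separate row and column
    scale vectors (Sinkhorn-style alternation), then round to a low-bit grid."""
    row_scale = np.ones(W.shape[0])
    col_scale = np.ones(W.shape[1])
    for _ in range(iters):
        # Alternate row and column rescaling so per-row/per-column spreads even out,
        # shrinking the kind of imbalance that a single scale factor struggles with.
        M = W / (np.outer(row_scale, col_scale) + eps)
        row_scale *= M.std(axis=1) + eps
        M = W / (np.outer(row_scale, col_scale) + eps)
        col_scale *= M.std(axis=0) + eps
    M = W / (np.outer(row_scale, col_scale) + eps)
    qmax = 2 ** (bits - 1) - 1
    step = np.abs(M).max() / qmax          # one step size for the balanced matrix
    Q = np.clip(np.round(M / step), -qmax - 1, qmax).astype(np.int8)
    return Q, row_scale, col_scale, step   # dequantize: outer(row, col) * step * Q

# Columns with very different magnitudes mimic the outliers that motivate dual-axis scaling.
rng = np.random.default_rng(0)
W = rng.standard_normal((8, 16)) * np.linspace(0.1, 5.0, 16)
Q, r, c, s = dual_axis_quantize(W)
W_hat = np.outer(r, c) * s * Q
print("max abs reconstruction error:", float(np.abs(W - W_hat).max()))
```

Per-group step sizes, other bit-widths and the calibrated A-SINQ variant mentioned above would all change the details; the sketch only shows why balancing rows and columns before rounding spreads quantization error more evenly.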
In this post, we demonstrate how to access AgentCore Gateway through a VPC interface endpoint from an Amazon Elastic Compute Cloud (Amazon EC2) instance in a VPC. We also show how to configure your VPC endpoint policy to provide secure access to the AgentCore Gateway while maintaining the principle of least privilege access.
OpenAI will host more than 1,500 developers at its largest annual conference on Monday, as the company behind ChatGPT seeks to maintain its edge in an increasingly competitive artificial intelligence landscape.

The third annual DevDay conference at San Francisco's Fort Mason represents a critical moment for OpenAI, which has seen its dominance challenged by rapid advances from Google's Gemini, Anthropic's Claude, and Meta's growing AI efforts. The event comes just days after OpenAI's new Sora video generation app topped Apple's App Store, demonstrating the company's ability to capture mainstream attention even as technical competitors close the gap.

Chief Executive Sam Altman will deliver the opening keynote at 10 a.m. Pacific time, promising "announcements, live demos, and a vision of how developers are reshaping the future with AI." The session will be livestreamed, but subsequent presentations — including a developer-focused "State of the Union" with President Greg Brockman and a closing conversation between Altman and Apple design legend Jony Ive — will only be available to in-person attendees.

Google and Meta challenge ChatGPT's developer dominance

The conference arrives at a pivotal moment for OpenAI. While the company's ChatGPT remains the most recognizable AI brand globally, technical evaluations show Google's latest Gemini models performing competitively on coding tasks, while Anthropic's Claude has gained traction among developers for its safety features and reasoning capabilities.

This intensifying competition has fundamentally altered OpenAI's strategic calculus. The company that once commanded premium pricing for access to its models now finds itself in a price war, releasing more capable systems at lower costs to retain developer loyalty. The shift reflects a maturing market where technical performance differences between leading AI models have narrowed considerably, forcing companies to compete on price, developer experience, and specialized capabilities rather than raw model superiority.

The timing of DevDay also follows several strategic moves by OpenAI that signal broader ambitions beyond its core chatbot business. The company recently launched Sora 2, its advanced video generation model, alongside a social media application that allows users to create and share AI-generated videos. Industry observers speculate that Monday's event could feature the long-rumored ChatGPT browser, potentially challenging Google Chrome's dominance.

Enterprise AI adoption takes center stage as revenue strategy shifts

This year's agenda reflects OpenAI's growing focus on enterprise customers, who provide more predictable revenue streams than consumer subscriptions. Sessions will cover "orchestrating agents at scale," enterprise AI adoption challenges, and how OpenAI applies its own technology to internal workflows across sales, support, and finance.

The enterprise emphasis marks a shift from earlier DevDay events. The inaugural 2023 conference introduced GPT-4 Turbo and the GPT Store marketplace, while 2024's more subdued gathering focused primarily on developer API improvements.
This year's expanded format suggests OpenAI views the developer community as crucial to its competitive positioning.

The State of the Union presentation is expected to focus on how artificial intelligence is transforming software development workflows, with anticipated demonstrations of enhanced capabilities in OpenAI's Codex programming assistant and the introduction of new open model offerings that could expand developer access to the company's technology.

Sora cinema and interactive AI demos showcase next-generation capabilities

Beyond formal presentations, DevDay will feature hands-on demonstrations of emerging technologies. A "Sora Cinema" will showcase AI-generated short films, while custom arcade games built using GPT-5 — OpenAI's latest model — will demonstrate the technology's creative applications.

Perhaps most intriguingly, attendees can interact with a "living portrait" of computer science pioneer Alan Turing that responds to questions, representing the kind of interactive AI experiences that could define the next generation of human-computer interaction.

The presence of Jony Ive at the closing session carries particular significance. The former Apple executive has been collaborating with OpenAI on a consumer AI device, suggesting Monday's conversation could provide insights into the company's hardware ambitions.

Developer ecosystem and market positioning face unprecedented competitive pressure

For enterprise technology decision-makers, DevDay represents more than a product showcase — it's a window into how AI will reshape software development and business processes. The conference agenda includes sessions on context engineering, agent orchestration, and enterprise scaling challenges that reflect real-world implementation hurdles.

The developer ecosystem around OpenAI's APIs has become a critical competitive moat. Companies like Cursor, Clay, and Decagon have built substantial businesses on OpenAI's foundation models, creating network effects that make switching to alternative providers more difficult. However, this ecosystem faces new challenges as competitors offer compelling alternatives. Google's recent improvements to Gemini for coding tasks and Meta's investments in its Superintelligence Labs represent serious threats to OpenAI's developer mindshare.

As the AI industry matures beyond initial breakthroughs, Monday's DevDay will test whether OpenAI can maintain its leadership position through superior tooling, developer experience, and enterprise-focused innovation. With over $500 billion in market valuation riding on continued growth, the stakes for this year's conference extend far beyond San Francisco's shores.

The keynote begins at 10 a.m. Pacific time and will be available via livestream on OpenAI's YouTube channel.
MIT CSAIL and McMaster researchers used a generative AI model to reveal how a narrow-spectrum antibiotic attacks disease-causing bacteria, speeding up a process that normally takes years.
OpenAI's CEO explains that its large language model has been misunderstood—and that he's changed his attitude to AGI.
Imbalanced datasets are a common challenge in machine learning.
This article critically explores both perspectives, weighing the benefits, drawbacks, and future potential of Study Mode to determine whether it lives up to the hype.
A framework-free guide for Python programmers
The post Build a Data Dashboard Using HTML, CSS, and JavaScript appeared first on Towards Data Science.
Understanding and implementing MobileNetV2 with PyTorch — the next generation of MobileNetV1
The post MobileNetV2 Paper Walkthrough: The Smarter Tiny Giant appeared first on Towards Data Science.
Build these AI agents that actually do useful work (and teach you a bunch).
Browser brains, Minecraft AI, King OpenAI, Apple AI glasses, and more...
A new study by Shanghai Jiao Tong University and SII Generative AI Research Lab (GAIR) shows that training large language models (LLMs) for complex, autonomous tasks does not require massive datasets. Their framework, LIMI (Less Is More for Intelligent Agency), builds on similar work in other areas of LLM research and finds that “machine autonomy emerges not from data abundance but from strategic curation of high-quality agentic demonstrations.” In other words, it's data quality, not quantity, that matters.

In experiments, the researchers found that with a small but carefully curated dataset of just 78 examples, they could train LLMs to outperform models trained on thousands of examples by a considerable margin on key industry benchmarks. This discovery could have important implications for enterprise applications where data is scarce or expensive to collect.

The challenge of building agents that work

The researchers define agency as “the emergent capacity of AI systems to function as autonomous agents–actively discovering problems, formulating hypotheses, and executing solutions through self-directed engagement with environments and tools.” In other words, these are AI systems that “don’t just think, but work.”

The problem is that current training frameworks assume that higher agentic intelligence requires a lot of data, as has been shown in the classic scaling laws of language modeling. The researchers argue that this approach leads to increasingly complex training pipelines and substantial resource requirements. Moreover, in many areas data is not abundant; it is hard to obtain and very expensive to curate.

However, research in other domains suggests that more data is not necessarily required to achieve training objectives. For example, LIMA, a 2023 paper, showed a model could be effectively aligned with just 1,000 curated examples. More recently, LIMO demonstrated that complex mathematical reasoning could emerge from only 817 training samples. With LIMI, the researchers sought to apply the same “less is more” principle to the complex world of AI agents.

How LIMI works

The LIMI framework demonstrates that sophisticated agentic intelligence can emerge from minimal but strategically curated demonstrations of autonomous behavior. Key to the framework is a pipeline for collecting high-quality demonstrations of agentic tasks. Each demonstration consists of two parts: a query and a trajectory. A query is a natural language request from a user, such as a software development requirement or a scientific research goal. The trajectory is the series of steps the AI takes to address the query, including its internal reasoning, its calls to external tools like a code interpreter, and the observations it receives from the environment.

For example, a query might be "build a simple chat application," and the trajectory would include the agent’s internal reasoning and action plan, the code it writes and executes, and the resulting output or errors. The trajectory can include multiple iterations of planning, execution, and reflection until it achieves the desired objective.

To build their dataset, the researchers started with 60 queries from real-world scenarios faced by professional developers and researchers. They then expanded this pool by using GPT-5 to synthesize additional queries from GitHub Pull Requests.
They employed a team of four computer science PhD students to vet the quality of these queries and choose 18 examples, creating a high-quality set of 78 queries focused on software development and research workflows.

To generate the trajectories, the same PhD students collaborated with a CLI coding agent powered by GPT-5 to complete the 78 tasks. They followed an iterative process, collecting the entire interaction sequence until each task was successfully completed, capturing the full arc of realistic human-AI collaboration, including back-and-forth communication and iterative refinement. For the more complex queries, the collected trajectories could extend to more than 152,000 tokens.

“This approach guarantees that our models learn not only from successful outcomes but also from the complete problem-solving process, including how to adapt strategies and recover from failures during collaborative execution,” the researchers write.

LIMI in action

To test their framework, the team evaluated models on AgencyBench, a benchmark designed for measuring agentic skills, as well as other established benchmarks for tool use and coding. They fine-tuned GLM-4.5, a powerful open-source model, using their 78-sample dataset and compared its performance against several frontier models, including the base GLM-4.5, Kimi-K2-Instruct and DeepSeek-V3.1.

The LIMI-trained model achieved an average score of 73.5% on AgencyBench, significantly outperforming all baseline models, the best of which (GLM-4.5) scored 45.1%. This superiority extended to other benchmarks covering tool use, coding, and scientific computing, where LIMI also outperformed all baselines.

More importantly, the study showed that the model trained on just 78 examples outperformed models trained with 10,000 samples from another dataset, delivering superior performance with 128 times less data. “This discovery fundamentally reshapes how we develop autonomous AI systems, suggesting that mastering agency requires understanding its essence, not scaling training data,” the researchers write. “As industries transition from thinking AI to working AI, LIMI provides a paradigm for sustainable cultivation of truly agentic intelligence.”

The researchers have released the code for data synthesis and training, along with the model weights. For the enterprise, this approach offers a practical path toward developing highly specialized AI agents. Instead of undertaking massive data collection projects, organizations can leverage their in-house talent and subject matter experts to create small, high-quality datasets for bespoke agentic tasks. This lowers the barrier to entry and enables businesses to build custom AI agents that can provide a competitive edge on the workflows that matter most to them.
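To make the query-and-trajectory structure described above concrete, here is a minimal, hypothetical Python sketch of how one demonstration might be represented in code; the class and field names, and the example content, are assumptions for illustration, not the authors' released data format.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical sketch of a LIMI-style demonstration record. Field names and the
# example content are illustrative assumptions, not the paper's actual schema.

@dataclass
class Step:
    reasoning: str     # the agent's internal plan or reflection at this turn
    tool_call: str     # e.g., code sent to an interpreter or a shell command
    observation: str   # output, error, or other environment feedback

@dataclass
class Demonstration:
    query: str                                            # natural-language request
    trajectory: List[Step] = field(default_factory=list)  # full arc to completion

demo = Demonstration(
    query="Build a simple chat application",
    trajectory=[
        Step(
            reasoning="Plan: scaffold a minimal web server, then add a message endpoint.",
            tool_call="python -m pip install flask",
            observation="Successfully installed flask",
        ),
        Step(
            reasoning="Install succeeded; write app.py and run it to verify.",
            tool_call="python app.py",
            observation="Running on http://127.0.0.1:5000",
        ),
    ],
)

# In this framing, the training corpus is a small list of such records (78 in the
# paper), serialized into whatever chat/tool-use format the base model expects.
corpus = [demo]
print(len(corpus), "demonstration(s);", len(demo.trajectory), "steps in the first trajectory")
```

The key property the paper emphasizes is that each record captures the complete problem-solving arc, including recovery from failures, rather than only the final successful output.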
In this post, we demonstrate how organizations can enhance their employee productivity by integrating Kore.ai’s AI for Work platform with Amazon Q Business. We show how to configure AI for Work as a data accessor for Amazon Q index for independent software vendors (ISVs), so employees can search enterprise knowledge and execute end-to-end agentic workflows involving search, reasoning, actions, and content generation.
Today, we’re excited to announce the Amazon Bedrock AgentCore Model Context Protocol (MCP) Server. With built-in support for runtime, gateway integration, identity management, and agent memory, the AgentCore MCP Server is purpose-built to speed up creation of components compatible with Bedrock AgentCore. You can use the AgentCore MCP server for rapid prototyping, production AI solutions, […]
Google wants its coding assistant, Jules, to be far more integrated into developers’ terminals than ever. The company wants to make it a more workflow-native tool, hoping that more people will use it beyond the chat interface.

Jules, which the company first announced in December 2024, will gain two new features: a Jules API to facilitate integration with IDEs, and Jules Tools, a CLI that lets the agent be invoked directly from the command line. More companies are finding that bringing their agents, coding-focused or not, into the applications people already use removes a lot of friction for enterprise users. Jules takes this trend a step further by adopting the same workflow as developers.

“Until today, you’ve primarily interacted with Jules in your web browser, but we know developers live in the terminal,” said Kathy Korevec, director of product at Google Labs, in a blog post. “It’s where we test, build, debug, and ship. That’s why we built Jules Tools, a lightweight command line interface, so you can spin up tasks, inspect what Jules is doing, and make the agent your own, all without leaving your workflow.”

Through the Jules CLI and API, Google said, enterprises will get “more control and flexibility by where and how you can use Jules.”

In May, Google released Jules into beta and announced GitHub integration. It became generally available in August with higher rate limits for Google AI Pro and Ultra users.

Moving away from chat

As Korevec said, the primary way developers and enterprises interact with agents is through a chat interface. Other coding agent providers have begun integrating their tools into IDEs and CLIs. OpenAI rolled out a fine-tuned version of GPT-5, called GPT-5-Codex, that will begin unifying its Codex coding assistant with IDEs, CLIs and other workflows. Google also released Gemini CLI, which acts similarly to Jules Tools but is open source and can be brought to other platforms.

Coding agents are already becoming essential tools for enterprises, and as their preferences solidify, providers like Google, OpenAI and Anthropic want these agents to be more top of mind for users. Enterprises envision AI agents as more passive, ambient tools, which can be difficult to imagine if most people interact with them via chat and have to prompt them to work. If adoption is high, that more proactive and integrated future will be clearer.

How Tools and API work

Developers can install Jules Tools via npm, which will then print a guide on how to use it. While in the CLI, an engineer or coder can use a command to prompt Jules to do a task, and a flag to customize it. For example, the string jules --theme light will switch to light mode.

On the API side, enterprises can connect the Jules API to other platforms they use. They can connect it to Slack, for example, so that some team members can trigger tasks directly from Slack if a bug is reported there, which will then tap their CI/CD pipeline.

Google also added other updates to help reduce latency and fix some environment and file system issues. These include:

File selector, to call out specific files in chat to add context
Memory, which gives Jules the ability to remember preferences going forward
Environment variables management, which gives Jules access to these variables while executing a task

Response so far

Since its announcement, the response has been mostly positive. However, some are confused over Google’s two coding agent CLI offerings.
Google is announcing a new $4 billion investment in Arkansas through 2027, which will include Google’s first data center in the state — located in West Memphis — along w…
How do platform firms set prices and make money?
The post Prediction vs. Search Models: What Data Scientists Are Missing appeared first on Towards Data Science.