The Power of the Learning Layer – AI Tech Stacks in Education
Upcoming events, Anthropic's Claude 3.5 Sonnet, an original interview with Stephen Jull, SchooLinks $80 million Series B, and more!
By Alex Sarlin
“The future of the modern AI stack is being decided now. More than ever, machines are capable of reasoning, creation, and creativity, and these new capabilities are driving enterprises to reconstruct their tech stacks. While the early days of this AI transformation felt like the Wild West, today, builders are converging around infrastructure, tooling, and approach.” - Menlo Ventures, The Modern AI Stack
As the world wraps its head around Generative AI, enthusiasts across sectors are working to clarify and visualize the future of AI tech stacks. Instead of using the term ‘stack’, the term 'layer' is emerging, as technical diagrams for these systems usually look like a layer cake or a pyramid with a clear order, top, and bottom:
Most AI layer diagrams, like those shown above, include the following layers from bottom to top:
Infrastructure Layer: Underlying AI chips, servers, and sometimes training data.
Model Layer: Large Language Models like GPT, Gemini, Claude, Llama, Mistral, but this could also include smaller models.
Application Layer: Specific applications built on top of LLMs for end users.
User Layer: While not commonly included now, I would suggest adding a User Layer to these diagrams, referring to the additional data and instructions contributed by individual users to generate the outputs they need.
In all these diagrams, layers near the bottom are general and shared across many applications and use cases. Layers near the top are tailored for specific purposes and end users. Each added layer tailors the output of the layers below it. This plays out in one of two ways:
An added layer provides new, local data (ideally data that would not be available at lower layers).
An added layer provides more instructions that configure what the technology should or shouldn’t do. This is also known as configuration, tuning, or fine-tuning.
Layers in the Real World
To take this out of theory and into real world application, here’s a common educational use case mapped out in AI layers that would be relevant today.
We’re going to turn that pyramid on its side to remove the hierarchical feel, but it is important to remember that the infrastructure layer and model layer here are used for billions of applications, while the user layer makes a single request at a time.
Note that the layers in green are the two that are tailored to create an educational use case for the AI, which means that educational application creators and users are left alone to battle any ‘non-educational’ tendencies of the underlying model.
User Layer: A high school teacher enters the local standard they are working with and a reading they have chosen (new data) and asks for a customized lesson plan that should work for a 45 minute period (gives instructions).
Application Layer: This request triggers an education-tailored application (like Magic School) to use its own internal data (example lesson plans, pedagogical research, etc.) and instructions (the configurations the application has made for its end users) as part of the ‘request’ to the underlying model.
Model Layer: These requests and instructions make an API call to a large language model, which has been trained on vast amounts of data (the public internet and beyond) and customized with thousands of additional instructions (be helpful, stay positive, avoid cursing, never use epithets, don’t provide personal data, etc.). Note that it is rarely clear which model any specific educational application is addressing, and there might even be more than one in the stack; some of the more sophisticated ones will ask small language models (which are much cheaper to run) to address simpler queries and large language models for more complex queries.
Infrastructure Layer: Finally, the model runs its calculations on the infrastructure layer, which is basically the chips and servers which underlie LLMs.
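To make the walkthrough above concrete, here is a minimal Python sketch of how each layer might add its own data and instructions to a single request, and how an application might route simple requests to a cheaper small model. Everything in it is an assumption for illustration: the `Layer` class, `build_request`, `choose_model`, the word-count threshold, and the placeholder model names do not come from any real product or API.

```python
from dataclasses import dataclass, field


@dataclass
class Layer:
    """One layer's contribution: extra data and extra instructions (hypothetical)."""
    name: str
    data: list[str] = field(default_factory=list)
    instructions: list[str] = field(default_factory=list)


def build_request(layers: list[Layer]) -> str:
    """Combine every layer's contribution into one text request, in the order given."""
    parts = []
    for layer in layers:
        for item in layer.instructions:
            parts.append(f"[{layer.name} instruction] {item}")
        for item in layer.data:
            parts.append(f"[{layer.name} data] {item}")
    return "\n".join(parts)


def choose_model(request: str, word_limit: int = 60) -> str:
    """Hypothetical router: short requests go to a small model, longer ones to a large model."""
    return "small-model" if len(request.split()) <= word_limit else "large-model"


# Application layer: the edtech tool's own examples and configuration.
application = Layer(
    name="application",
    data=["Example lesson plan: warm-up, guided reading, exit ticket."],
    instructions=["Align every activity to the stated standard."],
)

# User layer: the teacher's local data and specific ask.
user = Layer(
    name="user",
    data=["Standard: (local standard text)", "Reading: (teacher's chosen text)"],
    instructions=["Write a lesson plan for a 45 minute period."],
)

request = build_request([application, user])
model_name = choose_model(request)
print(model_name)
print(request)
```

A real application would send `request` to a model API rather than printing it, and would likely route on something richer than word count. The point is only that each layer tailors the request with data, instructions, or both.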
Multi-Directional Processing
For any particular request, these layers run in both directions: the user’s request is run down through the layers, and the output comes back up through the layers. There can technically be changes, checks, and constraints at any point in this process.
For example, in our use case above, any layer can be used to change a user’s request into model-friendly language on the way in… AND it can be used to assess the quality and appropriateness of the model’s answer on the way out before returning it to the user.
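The two-way flow can be sketched in a few lines of Python. This is an illustrative assumption rather than how any particular tool works: `rewrite_request`, `stub_model`, `check_output`, and `handle` are hypothetical names, the model is a stand-in that simply echoes its prompt, and the keyword check is deliberately crude.

```python
def rewrite_request(user_text: str) -> str:
    """Way in: turn the user's wording into model-friendly language."""
    return f"You are assisting a teacher. Task: {user_text.strip()}"


def stub_model(prompt: str) -> str:
    """Stand-in for the model layer; a real system would call an LLM here."""
    return f"DRAFT LESSON PLAN for: {prompt}"


def check_output(output: str, banned_terms: list[str]) -> bool:
    """Way out: a crude appropriateness check before the answer is returned."""
    lowered = output.lower()
    return not any(term in lowered for term in banned_terms)


def handle(user_text: str, banned_terms: list[str]) -> str:
    """Run the request down through the layers and the answer back up."""
    prompt = rewrite_request(user_text)
    output = stub_model(prompt)
    if not check_output(output, banned_terms):
        return "Sorry, that response did not pass review."
    return output


print(handle("  plan a lesson on photosynthesis  ", ["forbidden"]))
```

In practice the outbound check might itself be another model call or a rubric-based evaluator; the sketch only shows where such a check would sit in the flow.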
Challenges with AI Layers in Education
The big advantage of education-specific layers is that they are trained and customized specifically for education use cases, and can deliver simple user interfaces that allow users to get high-quality outputs with very little training. But there are disadvantages as well:
AI learning layers are technically unnecessary, as good user prompting can theoretically get you a good output. This is the origin of complaints about educational applications that are simply ‘wrappers’ of the model layer.
Any set of AI layers must be more effective than an average set of a few prompts. At Edtech Week, GSV’s Claire Zau referred to this as the “is your application better than seven informed prompts to ChatGPT” test (slightly paraphrased).
General purpose AI layers are typically designed to offer information to a user quickly, easily, and effectively, which is often at odds with what education use cases actually need. That means that education companies must often wrestle with foundational layers.
For student-facing applications, app developers also have to try to infuse a new personality into the model. LLMs are designed to be as general purpose as possible, so the default personality they display in chat interfaces is trained to be as helpful, unchallenging, and subservient as possible. Think more C-3PO, less Mr. Chips… Google researchers call this ‘sycophancy’.
But we know that what makes educators effective is often the exact opposite of servility and politeness: the ability to challenge, surprise, share passion, provoke, inspire, emotionally connect. To identify what’s unique about individual students and help them discover what really motivates them. In short, to care.
To date, every edtech company building the application layer for education has benefited from the underlying model, but has also had to wrestle with many of the model’s core instructions (as Kristen DiCerbo of Khan Academy told us on the podcast: “(GPT 4) really wants to give the answer”), as well as its default simulated personality.
Moreover, each company has to wrestle with the underlying layers on its own, starting from scratch. This often looks like providing as many instructions and datapoints as possible to try to tune the model’s behavior to be educationally effective (not to mention age-appropriate, privacy-sensitive, and politically neutral… sheesh!).
The result is that when education layers are only mildly customized, there can be an eerie similarity in multiple applications’ results, because the outputs from multiple tools are basically coming out unchanged from the same underlying models. When education layers are highly opinionated and customized, the quality of the answer is entirely contingent on the quality of the education layer’s training and instructions, which can be variable. Educators or students who try a few such applications and walk away uninspired may be tempted to turn their backs on the whole sector.
But there is a new layer coming…
Layering in the Learning Model
Several industries have begun building domain-specific LLMs trained on specific data and instructions tied to the needs of specific industries (see ClimateBERT for climate or Med-PaLM for medicine). What would it take to build an education-specific model (let’s call it the ‘learning model’) that could help power the entire educational application layer?
Well, technically the ‘learning model’ layer could be created by either neighboring layer. A major edtech application provider could create an educationally trained large language model that is then made available to other applications with API calls. But… this isn’t too likely. LLM training and maintenance are incredibly expensive. The cost of access to the underlying infrastructure (those darn chips!) alone is out of reach for most edtech companies.
Instead, it’s the creators of the foundational LLMs that are the best candidates for developing the learning model layer. They are by far the best resourced in the space, have access to mind-blowing amounts of data, computing power, and world-class expertise, and have the ability to collaborate with high-caliber researchers. Recently, some of the biggest ones have already begun to turn their attention to the mismatch between LLM behavior and educational use cases, and started building education specific models, based on their original models but tuned and enhanced with new data and instructions.
Let’s look at some of the projects in flight.
Google’s LearnLM
This May at Google I/O, Google announced “LearnLM”, a “new family of models fine-tuned for learning, based on Gemini.” They put out a technical report about the training of the models (with 74 authors!), and have built applications on top of it, such as Learn About and NotebookLM, along with YouTube and Google Classroom integration. In June, Claire Zau wrote a terrific breakdown of the Google LearnLM paper (and other educational models) here.
While Google’s announcement caused some ripples in edtech circles at the time, the recent rapid growth and excitement around NotebookLM (and its signature “turn any content into a discussion podcast” feature) is an early demonstration of the power of the learning layer to fuel product innovation. If history serves, Google is likely to have additional products and features up its sleeve. Perhaps LearnLM will even have APIs of its own for third party developers?
The Chan Zuckerberg Initiative
The Chan Zuckerberg Initiative (CZI), the philanthropic organization started by Dr. Priscilla Chan and Meta founder Mark Zuckerberg, is currently exploring what AI infrastructure for education should exist and be open and accessible to all to enable impactful products that help teachers bring what we know from the science of learning into classroom practice. In partnership with educators and researchers, CZI is working to improve AI systems with:
High-quality education knowledge like curriculum, standards, and research from the science of learning.
Auto evaluators of AI-generated content. They are developing evaluation rubrics with organizations like The Achievement Network (ANet) and are in the early phases of piloting a few tools.
According to Sandra Liu Huang, CZI Head of Education and VP of Product:
“For nearly a decade, CZI has helped to advance research and tools to make the science about how students learn and develop available to all educators. We're excited our technology is in a new phase that is focused on building needed AI infrastructure to make it easier for edtech developers to build tools grounded in the science of learning.”
Microsoft Phi-3
In April, Microsoft announced Phi-3, a family of small language models (SLMs) trained on high-quality but simplified stories and educational materials like textbooks. The Phi models are designed primarily to expand access to GenAI by allowing models to run on devices, but also to optimize certain properties of learning, such as tuning outputs to the age and literacy level of learners. To my eye, there hasn’t been much buzz about Phi-3 in the education community, but the concept of SLMs for education is one that is likely to influence the field going forward.
Anthropic
Anthropic, to date, has seemed to take a different approach to the idea of ‘education-focused LLMs.’ They’ve created their most advanced student model (Claude Opus), which actively works to outperform other models on most tasks that resemble undergraduate content work, as well as tried to understand the inner workings of the Claude models to identify interpretable feature neighborhoods that define its ‘thinking’. The results are interesting ‘mindmaps’ that could be used to generate learning tools in the future, especially when and if Anthropic moves more assertively into the education market.
OpenAI
As far as we know, OpenAI has not yet released a model that is tailored to educational use cases. Its “ChatGPT for Edu” product is primarily based on licensing GPT-4o, its core commercial LLM, to educational entities rather than building a model tuned for the education use case. OpenAI also launched a new model (o1) in September that, like Claude Opus, is designed to act like a super-student, and specifically to be metacognitive about its own reasoning and strategic thinking.
According to OpenAI’s announcement of the o1 model:
“We trained these models to spend more time thinking through problems before they respond, much like a person would. Through training, they learn to refine their thinking process, try different strategies, and recognize their mistakes… In our tests, the next model update performs similarly to PhD students on challenging benchmark tasks in physics, chemistry, and biology.”
Conclusion
The insertion of a ‘learning model layer’ could lead to a sea change in the edtech space, especially if the learning layer is made available to edtech providers.
Potential value additions of this layer to our sector include:
Raising the Bar on Pedagogical Quality: It would, at least in theory, raise pedagogical quality across the board and take the burden off of edtech companies to have to tailor LLMs to their particular use cases. The learning model would be trained on all the learning theory and instructional design you can imagine; it would not only know its UDL from PBL from IBL, but it should be able to act much more like a true educator.
Interoperability: The learning model could allow application layers to connect diverse data sources across platforms, tools, and programs for both students and teachers.
Data insights: The learning model layer would allow users to analyze and query data using common language prompts and searches and generate instant visualizations and representations.
Dynamic assessment: The learning model could provide the ability to move from high-stakes summative assessment to a more formative, continuous, stealth assessment, per the work of Daniel Schwartz and Valerie Shute.
Acceleration of Edtech Businesses: Without the need to tailor model behavior, AI-focused Edtech companies could spend more time on building and selling their products and serving the particular needs of their users.
Evolution Over Time: Any new dataset, especially those around efficacy results or student outcomes, could be used to finetune and tailor the model further; we could immediately implement ‘what’s working’ across the board.
Of course, relying on the biggest technology players, most of which are for-profit, to support a common good is not without risk; it raises big questions (even dilemmas) around who gets to decide what happens in the layer, in areas like:
Cost: How much would a learning model provider charge companies to access it?
Access: Would a learning model layer be accessible to all, or to select tools?
Default Pedagogical Model: Without additional training, the layer would need to have a default ‘personality’ and pedagogical stance. So what exactly does the ‘perfect teacher’ look like, anyway?
Resolving Pedagogical Disputes: Education experts love to argue; advocates of constructivism and direct instruction (for one of many examples) have been at odds for many years based on competing research and theory; so where does LearnLM land on this? Who decides?
This edition of the Edtech Insiders Newsletter is sponsored by Tuck Advisors.
Since 2017, Tuck Advisors has been a trusted name in Education M&A. Founded by serial entrepreneurs with over 25 years of experience founding, investing in, and selling companies, we believe you deserve M&A advisors who work as hard as you do. Are you a founder thinking of selling your company? Have you received any UFOs (unsolicited flattering offers) and need help determining if they’re real or hoaxes? We can help! Reach out to us at confidential@tuckadvisors.com.
Edtech Insiders Upcoming Events
Let’s rethink how impact is measured and achieved in EdTech.
Expand on RCTs: Explore richer, multidimensional evidence.
Context-Driven Rigor: Align research with real-world applications.
Overcome Obstacles: Exchange insights and solutions with peers.
Featuring:
Natalia I. Kucirkova – EdTech researcher and Co-Founder and Director at International Centre for EdTech Impact; WiKIT
David Dockterman – Expert on evidence-based practices and Lecturer at Harvard University Graduate School of Education
Walk away with fresh strategies and actionable ideas to enhance your organization’s impact!
Bay Area Edtech Happy Hour
Our next Bay Area Edtech Happy Hour will be held at Common Sense Media Lounge in San Francisco on November 13! Join Edtech Insiders and our sponsors and partners keen research, Cooley, Tuck Advisors, Common Sense Media, and FOHE for an evening to connect, collaborate, and enjoy a drink together!
We’d love to see you there! RSVP if you plan to attend, as space is limited.
Cooley LLP is the go-to law firm for edtech innovators, from early childhood through workforce. Informed by decades of experience in the education vertical, Cooley created the first edtech practice to provide industry-informed, business-minded counsel to companies and organizations at all stages of the corporate lifecycle. Cooley provides a multidisciplinary approach to client needs, offering seamless collaboration across offices and practices.
To learn more about what Cooley can do for you, reach out to Naomi May.
Top Edtech Headlines
1. Anthropic’s Claude 3.5 Sonnet Can Use a Computer
Anthropic’s latest AI model, Claude 3.5 Sonnet, introduces a breakthrough capability that allows it to interact with computers in a human-like way. Through a new "computer use" API, Claude can navigate screens, click buttons, and type text, aimed at tasks like form-filling or interface testing. While promising, this feature is still experimental, and Anthropic advises users to be mindful of potential errors as it refines the technology to ensure secure, ethical application in various fields, including accessibility and education.
2. Physics Wallah Raises $210 Million Series B
Physics Wallah (PW), an Indian edtech startup, has secured $210 million in Series B funding, raising its valuation to $2.8 billion. This investment, led by Hornbill Capital, positions PW as India's third-largest edtech firm, with plans to expand its offerings and enhance its technology. PW’s rapid growth highlights its impact on accessible education in India, aiming to maintain profitability while pursuing strategic partnerships and technology advancements.
3. The World Needs a ‘Premortem’ on GenAI in Education
The Brookings Institution, a renowned public policy think tank, calls for a "premortem" on the potential risks generative AI may introduce in education. This proactive approach aims to identify and mitigate AI’s impacts on teaching methods, student learning, and ethical concerns before widespread adoption. Through research and guidance, Brookings advocates for policies ensuring AI supports, rather than disrupts, educational values.
4. Leveraging AI to Improve Learner Outcomes Report
UPCEA and Instructure’s latest study reveals that while many educational institutions are keen on AI’s potential to enhance learner outcomes, nearly half have yet to adopt AI-driven tools. Key obstacles include academic integrity concerns, data privacy issues, and insufficient staff training. The findings underscore a growing interest in digital credentials and predictive analytics for student success, though only a small percentage have implemented comprehensive learner records.
5. AI for Education and Carnegie Learning Join Forces
Carnegie Learning and AI for Education have launched a partnership aimed at increasing AI literacy among K-12 educators across the U.S. The initiative will combine Carnegie Learning’s educational tools with AI for Education’s expertise in generative AI training, helping teachers understand and apply AI in classrooms responsibly. This collaboration addresses the high demand for AI resources in education and equips teachers to enhance student outcomes with cutting-edge technology.
6. SchooLinks Announces $80M Series B
SchooLinks, a platform for K-12 college and career readiness, recently raised $80 million in a Series B funding round led by Susquehanna Growth Equity. This investment will enable SchooLinks to expand its offerings, supporting school districts in preparing students for college and careers through tools like career assessments, academic planning, and workforce connections. Founded by Katie Fang, SchooLinks is gaining traction as districts nationwide seek comprehensive tools for student engagement and post-graduation readiness.
Join Edtech Insiders+
If you love Edtech Insiders and want to continue to support our work, consider becoming a paid subscriber! By joining Edtech Insiders+ you receive:
Early access to all Edtech Insiders events. No more sprinting to sign up for our monthly happy hours, edtech summits, and online conferences.
Access to our exclusive WhatsApp channel to connect with the Edtech Insiders community and discuss the latest trends and news in our space.
All proceeds from subscriptions will help us invest in our podcast, newsletter, events, and community. Our goal is to make Edtech Insiders an enduring and sustainable community offering that connects the edtech ecosystem!
We would be so excited if you decide to join us as a member of Edtech Insiders+.
Interview: Stephen Jull
We have had some amazing guests on The Edtech Insiders Podcast in the last few weeks. One of our stand-out interviews is with Stephen Jull, the former COO/CFO and Co-Founder of GeoGebra (acquired by BYJU’S in 2021), and a limited partner at Emerge Education.
Here’s a deep dive on our interview with Stephen Jull, and we encourage you to give the full episode a listen for more!
Founding GeoGebra and the Evolution of EdTech
Stephen Jull shares his journey as a co-founder of GeoGebra, a popular mathematics education software that made advanced math tools accessible to students and teachers globally. Through strategic partnerships and a community-centered approach, GeoGebra reached over 500 million users, fostering an era of tech-driven learning advancements, especially in STEM.
“It was 10 years of just putting your feet on the floor every day and making, you know, the math world go around...GeoGebra became the powerful tool we envisioned, thanks to the shared passion within our team and the broader education community.” - Stephen Jull
Navigating the BYJU’s Acquisition and Scaling Edtech Impact
Jull discusses the acquisition of GeoGebra by BYJU’S, an Indian edtech giant, and the shift toward an international impact. He reflects on the benefits and complexities of joining BYJU’S, especially as BYJU’S rapidly expanded globally and adapted to educational demands across different cultures.
“When you’re part of a larger system, you see that education and tech can go hand in hand to create opportunities for change... Byju himself was like the Alex Honnold of climbers. He was free soloing, breaking the glass ceiling... raising capital at a crazy pace.” - Stephen Jull
The Promise and Perils of AI in Education
Jull advocates for an AI-supported model of personalized learning that departs from traditional group instruction. He suggests AI could be the “missing link” in enabling differentiation for every student in a way that’s historically been impossible, creating a truly flexible, student-centered system.
“We’ve always had amazing learning communities...the missing link has been our inability to differentiate for every child. AI could finally be the tool that enables every student to find their passion and path each day in every subject.” - Stephen Jull
Embracing Neurodiversity in EdTech Design
Highlighting neurodiversity, Jull emphasizes that designing for diverse cognitive styles should be a foundational element in educational technology. Neurodiverse students could benefit from systems that inherently accommodate non-linear thinking, which could also enhance learning for neurotypical students.
“With individuality comes opportunity...If we build AI models that don’t account for neurodiversity, we’re missing a chance to enrich learning for all. Neurodiversity isn’t a challenge to overcome; it’s an asset that could improve how we design educational tools.” - Stephen Jull
The Global Landscape for Educational Innovation
Jull identifies regions that are positioned to lead in edtech innovation, such as Finland, Singapore, the UAE, and China. He notes that each has its own unique approaches and willingness to adopt AI and other technologies in education, which could serve as models globally.
“We’re at a frontier where education systems that dare to take risks will likely lead the charge in transformative learning... whether it’s established leaders like Singapore or new innovators like the UAE, those willing to break traditional molds will shape the future.” - Stephen Jull
Curious to Learn More?
You can listen to our full interview with Stephen Jull, as well as interviews with many other edtech founders, investors, and thought leaders at The Edtech Insiders Podcast! Check it out, and as always, we’d love to hear what you think!