How Palantir Meets IL6 Security Requirements with Apollo

Building secure software requires robust delivery and management processes, with the ability to quickly detect and fix issues, discover new vulnerabilities, and deploy patches. This is especially difficult when services are run in restricted, air-gapped environments or remote locations, and was the main reason we built Palantir Apollo.

With Apollo, we are able to patch, update, or make changes to a service in 3.5 minutes on average and have significantly reduced the time required to remediate production issues, from hours to under 5 minutes.

For 20 years, Palantir has worked alongside partners in the defense and intelligence communities, and we have encoded what we've learned into how we manage software in national security contexts. In October 2022, Palantir received an Impact Level 6 (IL6) provisional authorization (PA) from the Defense Information Systems Agency (DISA) for our federal cloud service offering.

IL6 accreditation is a powerful endorsement, recognizing that Palantir has met DISA’s rigorous security and compliance standards and making it easier for U.S. Government entities to use Palantir products for some of their most sensitive work.

The road to IL6 accreditation can be challenging and costly. In this blog post, we share how we designed a consistent, cross-network deployment model using Palantir Apollo’s built-in features and controls in order to satisfy the requirements for operating in IL6 environments.

What are FedRAMP, IL5, and IL6?

With the rise of cloud computing in the government, DISA defined the operating standards for software providers seeking to offer their services in government cloud environments. These standards are meant to ensure that providers demonstrate best practices when securing the sensitive work happening in their products.

DISA’s standards are based on a framework that measures risk in a provider’s holistic cloud offering. Providers must demonstrate both their products and their operating strategy are deployed with safety controls aligned to various levels of data sensitivity. In general, more controls mean less risk in a provider’s offering, making it eligible to handle data at higher sensitivity levels.


Impact Levels (ILs) are defined in DISA’s Cloud Computing SRG as Department of Defense (DoD)-developed categories for leveraging cloud computing based on the “potential impact should the confidentiality or the integrity of the information be compromised.” There are currently four defined ILs (2, 4, 5, and 6), with IL6 being the highest and the only IL covering potentially classified data that “could be expected to have a serious adverse effect on organizational operations” (the SRG is available for download as a .zip from here).

Defining these standards allows DISA to enable a “Do Once, Use Many” approach to software accreditation that was pioneered with the FedRAMP program. For commercial providers, IL6 authorization means government agencies can fast track use of their services in place of having to run lengthy and bespoke audit and accreditation processes. The DoD maintains a Cloud Service Catalog that lists offerings that have already been granted PAs, making it easy for potential user groups to pick vetted products.

NIST and the Risk Management Framework

The DoD bases its security evaluations on the National Institute of Standards and Technology’s (NIST) Risk Management Framework (RMF), which outlines a generic process used widely across the U.S. Government to evaluate IT systems.

The RMF provides guidance for identifying which security controls exist in a system so that the RMF user can assess the system and determine if it meets the users’ needs, like the set of requirements DISA established for IL6.

Controls are descriptive and focus on whole system characteristics, including those of the organization that created and operates the system. For example, the Remote Access (AC-17) control is defined as:

The organization:

  • Establishes and documents usage restrictions, configuration/connection requirements, and implementation guidance for each type of remote access allowed;
  • Authorizes remote access to the information system prior to allowing such connections.

Because of how controls are defined, a primary aspect of the IL6 authorization process is demonstrating how a system behaves to match control descriptions.

Demonstrating NIST Controls with Apollo

Apollo was designed with many of the NIST controls in mind, which made it easier for us to assemble and demonstrate an IL6-eligible offering using Apollo’s out-of-the-box features.

Below we share how Apollo allows us to address six of the twenty NIST Control Families (categories of risk management controls) that are major themes in the hundreds of controls adopted as IL6 requirements.

System and Services Acquisition (SA) and Supply Chain Risk Management (SR)

The System and Services Acquisition (SA) family and related Supply Chain Risk Management (SR) family (created in Revision 5 of the RMF guidelines) cover the controls and processes that verify the integrity of the components of a system. These measures ensure that component parts have been vetted and evaluated, and that the system has safeguards in place as it inevitably evolves, including if a new component is added or a version is upgraded.

In a software context, modern applications are now composed of hundreds of individual software libraries, many of which come from the open source community. Securing a system’s software supply chain requires knowing when new vulnerabilities are found in code that’s running in the system, which happens nearly every day.

Apollo helped us address SA and SR controls because it has container vulnerability scanning built directly into it.

Figure 1: The security scan status appears for each Release on the Product page for an open-source distribution of Redis

When a new Product Release becomes available, Apollo automatically scans the Release to see if it’s subject to any of the vulnerabilities in public security catalogs, like MITRE’s Common Vulnerabilities and Exposures (CVE) List.

If Apollo finds that a Release has known vulnerabilities, it alerts the team at Palantir responsible for developing the Product in order to make sure a team member updates the code to patch the issue. Additionally, our information security teams use vulnerability severity to define criteria for what can be deployed while still keeping our system within IL6 requirements.
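As a rough illustration of this kind of matching (not Apollo's actual scanner or data model), a release's component versions can be checked against a simplified CVE feed; all names and data below are hypothetical:

```python
# Components shipped in a hypothetical release: package name -> version.
release_components = {
    "redis": "6.2.5",
    "openssl": "1.1.1k",
}

# Simplified CVE feed entries: (cve_id, package, affected_versions, severity).
cve_feed = [
    ("CVE-2021-3711", "openssl", {"1.1.1k"}, "CRITICAL"),
    ("CVE-2022-0543", "redis", {"6.0.0"}, "CRITICAL"),
]

def scan(components, feed):
    """Return CVEs whose affected package/version pair ships in the release."""
    findings = []
    for cve_id, package, affected, severity in feed:
        if components.get(package) in affected:
            findings.append((cve_id, package, severity))
    return findings

findings = scan(release_components, cve_feed)
# Only the openssl entry matches: redis 6.2.5 is not in that CVE's affected set.
```

A real scanner also has to resolve version ranges and transitive dependencies, which is exactly why doing this by hand across hundreds of libraries is intractable.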

Figure 2: An Apollo scan of an open-source distribution of Redis shows active CVEs

Scanning for these weak spots is now an automatic part of Apollo and a crucial element in keeping our IL6 services secure. Without it, mapping newly discovered security findings to the places the affected code is used in a software platform is an arduous, manual process that becomes intractable as a platform grows in complexity, making it difficult or impossible to accurately assess the security of a system’s components.

Configuration Management (CM)

The Configuration Management (CM) group covers the safety controls that exist in the system for validating and applying changes to production environments.

CM controls include the existence of review and approval steps when changing configuration, as well as the ability within the system for administrators to assign approval authority to different users based on what kind of change is proposed.

Apollo maintains a YAML-based configuration file for each individual microservice within its configuration management service. Any proposed configuration change creates a Change Request (CR), which must then be reviewed by the owner of the product or environment.

Changes within our IL6 environments are sent to Palantir’s centralized team of operations personnel, Baseline, which verifies that the Change won’t cause disruptions and approves the new configuration to be applied by Apollo. In development and testing environments, Product teams are responsible for approving changes. Because each service has its own configuration, it’s possible to fine-tune an approval flow for whatever’s most appropriate for an individual product or environment.
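The approval routing described above can be sketched roughly as follows; the `ChangeRequest` class, team names, and environment types are illustrative assumptions, not Apollo's actual configuration model:

```python
def approver_for(environment_type: str) -> str:
    """Production changes route to the central Baseline team; development
    and testing changes route to the owning product team."""
    if environment_type == "production":
        return "baseline-team"
    if environment_type in ("development", "testing"):
        return "product-team"
    raise ValueError(f"unknown environment type: {environment_type}")

class ChangeRequest:
    def __init__(self, service, diff, environment_type):
        self.service = service
        self.diff = diff
        self.approver = approver_for(environment_type)
        self.approved_by = None  # recorded for the audit trail (AU controls)

    def approve(self, user):
        # Who approved and when is saved with the change history.
        self.approved_by = user

cr = ChangeRequest("redis", {"maxmemory": "2gb"}, "production")
```

Because routing is derived per environment, an approval flow can be tuned to whatever is most appropriate for a given product or environment.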

Figure 3: An example Change Request to remove a Product from an Environment

A history of changes is saved and made available for each service, where you can see who approved a CR and when, which also addresses Audit and Accountability (AU) controls.

When a change is made, Apollo first validates it and then applies it during configured maintenance windows, which helps to avoid the human error that’s common in managing service configuration, like introducing an untested typo that interrupts production services. This added stability has made our systems easier to manage and, consequently, easier to keep secure.

Incident Response (IR)

The Incident Response (IR) control family pertains to how effectively an organization can respond to incidents in their software, including when its system comes under attack from bad actors.

A crucial aspect to meeting IR goals is being able to quickly patch a system, quarantine only the affected parts of the system, and restore services as quickly as is safely possible.

A major feature that Apollo brings to our response process is the ability to quickly ship code updates across network lines. If a product owner needs to patch a service, they simply need to make a code change. From there, a release is generated, and Apollo prepares an export for IL6 that is applied automatically once it’s transferred by our Network Operations Center (NOC) team according to IL6 security protocols. Apollo performs the upgrade without intervention, which removes expensive coordination steps between the product owner and the NOC.

Figure 4: How Apollo works across network lines to an air-gapped deployment

Additionally, Apollo allows us to save Templates of our Environments that contain configuration that is separate from the infrastructure itself. This has made it easy for us to take a “cattle, not pets” approach to underlying infrastructure. With secrets and other configuration decoupled from the Kubernetes cluster or VMs that run the services, we can easily reapply them onto new infrastructure should an incident ever pop up, making it simple to isolate and replace nodes of a service.

Figure 5: Templates make it easy to manage Environments that all use the same baseline

Contingency Planning (CP)

Contingency Planning (CP) controls demonstrate preparedness should service instability arise that would otherwise interrupt services. This includes the human component of training personnel to respond appropriately, as well as automatic controls that kick in when problems are detected.

We address the CP family by using Apollo’s in-platform monitoring and alerting, which allows product or environment owners to define alerting thresholds based on open-standard metric types, including Prometheus’s metrics format.
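As a minimal illustration (not Apollo's monitor schema), a threshold check over a Prometheus-style gauge sample might look like the following; metric names and limits are made up:

```python
def evaluate(metric: str, value: float, thresholds: dict):
    """Return an alert severity if the sample crosses a configured
    threshold, or None if the metric is healthy."""
    limits = thresholds.get(metric, {})
    # Check the most severe threshold first so one sample yields one alert.
    for severity in ("critical", "warning"):
        limit = limits.get(severity)
        if limit is not None and value >= limit:
            return severity
    return None

thresholds = {"jvm_heap_used_ratio": {"warning": 0.80, "critical": 0.95}}

evaluate("jvm_heap_used_ratio", 0.97, thresholds)  # -> "critical"
evaluate("jvm_heap_used_ratio", 0.50, thresholds)  # -> None
```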

Figure 6: Monitors configured for all of the Products in an Environment make it easy to track the health of software components

Apollo monitors our IL6 services and routes alerts to members of our NOC team through an embedded alert inbox. Alerts are automatically linked to relevant service logging and any associated Apollo activity, which has drastically sped up the remediation process when services or infrastructure experience unexpected issues. The NOC is able to address alerts by following runbooks prepared for and linked to within alerts. When needed, alerts are triaged to teams that own the product for more input.

Because we’ve standardized our monitors in Apollo, we’ve been able to create straightforward protocols and processes for responding to incidents, which means we can act on contingency plans more quickly and keep our systems secure.

Access Control (AC)

The Access Control (AC) control family describes the measures in a system for managing accounts and ensuring accounts are only given the appropriate levels of permissions to perform actions in the system.

Robustly addressing AC controls includes having a flexible system where permissions for individual actions can be granted based on what a user needs to do within a specific context.

In Apollo, every action and API has an associated role, which can be assigned to individual users or to Apollo Teams; Teams are managed within Apollo and can be mirrored from an SSO provider.

Roles necessary to operating environments (e.g. approving the installation of a new component) are granted to our Baseline team, and are restricted as needed to a smaller group of environment owners based on an environment’s compliance requirements. Team management is reserved for administrators, and roles that include product lifecycle actions (e.g. recalling a product release) are given to development teams.
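A rough sketch of this role model, using hypothetical role and team names rather than Apollo's actual API:

```python
# Which teams hold which roles (grants are illustrative).
ROLE_GRANTS = {
    "environment:approve-install": {"baseline-team"},
    "product:recall-release": {"dev-team"},
    "admin:manage-teams": {"administrators"},
}

# Team membership, e.g. mirrored from an SSO provider.
TEAM_MEMBERS = {
    "baseline-team": {"alice"},
    "dev-team": {"bob"},
    "administrators": {"carol"},
}

def can(user: str, role: str) -> bool:
    """A user holds a role if any of their teams has been granted it."""
    teams = {t for t, members in TEAM_MEMBERS.items() if user in members}
    return bool(teams & ROLE_GRANTS.get(role, set()))
```

Granting each action its own role is what makes least-privilege assignments possible: a development team can recall a release without ever holding environment-operation rights.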

Figure 7: Products and Environments have configurable ownership that ensures the right team is monitoring their resources

Having a single system to divide responsibilities by functional area means that our access control system is consistent and easy to understand. Further, being able to granularly assign roles for different actions makes it possible to meet the principle of least-privilege access that underpins AC controls.

Conclusion

The bar to operate with IL6 information is rightfully a high one. We know obtaining IL6 authorization can feel like a long process — however, we believe this should not prevent the best technology from being available to the U.S. Government. It’s with that belief that we built Apollo, which became the foundation for how we deploy to all of our highly secure and regulated environments, including FedRAMP, IL5, and IL6.

Additionally, we recently started a new program, FedStart, where we partner with organizations just starting their accreditation journey to bring their technology to these environments. If you’re interested in working together, reach out to us at fedstart@palantir.com for more information.

Get in touch if you want to learn more about how Apollo can help you deploy to any kind of air-gapped environment, and check out the Apollo Content Hub for white papers and other case studies.

This post originally appeared on Palantir.com and is re-published with permission.

Download our Resource, “Solution Overview: Palantir—Apollo” to learn more about how Palantir Technologies can support your organization.

Palantir Announces Availability of Foundry on Microsoft Azure

Amid global economic uncertainty, access to integrated, protected, and trusted data and analytics is more vital than ever when it comes to creating business value. To further enable transformative outcomes, Palantir is pleased to partner with Microsoft in making Palantir Foundry available on Microsoft Azure, empowering existing and new customers to more effectively apply data and analytics in their operational decision-making.

Through this new collaboration, organizations will be able to quickly deploy Palantir Foundry — our ontology-powered operating system for the modern enterprise — and to unlock further value in Azure Data Services with Microsoft’s cloud-scale analytics and AI solutions.

As part of this relationship, our Foundry platform is available on Azure, enabling customers to deploy our software at speed, while benefiting from Azure’s trusted and secure infrastructure, as well as its global commercial footprint.

Availability on the Azure Marketplace will enable seamless purchasing and invoicing, with customers able to use their existing Microsoft Azure Consumption Commitment (MACC) to purchase a Foundry license and infrastructure costs.

Foundry’s single-view ontology can layer on top of Azure Data Services, letting customers build on their existing investments for faster time to value by unlocking insights and by predicting and simulating outcomes for more data-driven decision making.


The platform will also integrate with native Azure Data Services for enterprise data management on Microsoft Azure, such as Azure Data Lake, Azure Synapse Analytics, Microsoft Power BI, Microsoft Dynamics 365, Microsoft Teams, and Microsoft Industry Clouds. This means customers will be able to further build on their existing IT investments in Azure Data Services through Palantir’s software-defined data integration (SDDI) to products like Azure Synapse Analytics, Azure Data Lake Storage, Azure AI and Azure Machine Learning, alongside others.

“We’re pleased to partner with Palantir to bring Foundry to Microsoft Azure. Organizations around the world will be able to make their data more actionable by using Palantir’s platform for data-driven operations and decision making, powered by Azure’s cloud-scale analytics and comprehensive AI services.” — Deb Cupp, President, Microsoft North America

Better Together with Palantir Foundry and Azure Data Services

Our new relationship with Microsoft will also see us go to market together in joint opportunities across industries like energy and renewables, retail and CPG, as well as other cross-industry sustainability and ESG efforts, where Microsoft customers can enhance their existing digital transformation efforts in Azure Data Services:

  • Energy and Renewables: Foundry enables customers to integrate data at speed and scale from remote sensors and Azure IoT Hub, and to apply this data to drive up the efficiency of assets, from offshore oil to onshore wind.
  • Retail and CPG: The platform enables organizations to bring near-instant visibility into demand and the ability to adapt their promotions, inventory, and operations in real time.
  • Sustainability and ESG: We’re helping organizations in their net zero transition by creating a common carbon ontology to empower front line decision makers to adjust their work to meet emissions targets.
  • Healthcare and Life Sciences: Foundry is used across the healthcare and life sciences value chain, from drug discovery and development, through to manufacturing, marketing, and sales. Integrate with Azure Health Data Services to manage protected health information.

We are also working together to accelerate time to value for customers in these industries and many more by consolidating SAP and other ERPs using Palantir HyperAuto, helping them create a more integrated data landscape. Palantir HyperAuto can help customers accelerate their journey to SAP on Azure and quickly surface insights in just hours.

Partnership in Action

Additional Palantir Foundry capabilities that can be deployed at speed via Azure include those from partners like the connected vehicle company Wejo. Wejo is a proud Palantir partner building on Foundry’s capabilities, and a global leader in Smart Mobility for Good™ cloud and software solutions for connected, electric, and autonomous vehicle data.

Their data comes from over 92 billion vehicle journeys and consists of more than 19.5 trillion data points, giving businesses and organizations across a variety of industries the power to innovate, drive growth, transform communities, and save lives.

“We want to help reduce the 1.3 million deaths that happen each year on the road and the additional 8 million due to emissions with smart mobility for good products and services. As part of the Foundry platform, we are excited that Palantir customers with Azure will be able to more rapidly drive integrated, protected, and trusted data and analytics from Wejo for smart mobility initiatives and business value.” — Sarah Larner, Executive Vice President of Strategy and Innovation at Wejo

We look forward to working with Microsoft to broaden Foundry’s availability, enabling clients across industries to better leverage their existing investments for improved operational outcomes.

Those interested in learning more about Palantir and Microsoft’s relationship can visit the Palantir website or get started today via the Azure Marketplace.

This post contains forward-looking statements within the meaning of Section 27A of the Securities Act of 1933, as amended, and Section 21E of the Securities Exchange Act of 1934, as amended. These statements may relate to, but are not limited to, expectations regarding the terms of the partnership and the expected benefits of the software platform and solutions. Forward-looking statements are inherently subject to risks and uncertainties, some of which cannot be predicted or quantified. Forward-looking statements are based on information available at the time those statements are made and were based on current expectations as well as the beliefs and assumptions of management as of that time with respect to future events. These statements are subject to risks and uncertainties, many of which involve factors or circumstances that are beyond Palantir’s control. These risks and uncertainties include Palantir’s ability to meet the unique needs of its customers; the failure of its platforms and solutions to satisfy its customers or perform as desired; the frequency or severity of any software and implementation errors; its platforms’ reliability; and the ability to modify or terminate the partnership. Additional information regarding these and other risks and uncertainties is included in the filings Palantir makes with the Securities and Exchange Commission from time to time. Except as required by law, Palantir does not undertake any obligation to publicly update or revise any forward-looking statement, whether as a result of new information, future developments, or otherwise.

This post originally appeared on Palantir.com and is re-published with permission.

Download our Resource, “Impact Study: Accelerating Interoperability with Palantir Foundry” to learn more about how Palantir Technologies can support your organization.

Updates from Palantir Edge AI in Space

In April 2022, Palantir launched its Edge AI solution into space onboard Satellogic’s NewSat-27 as part of the SpaceX Transporter-4 mission. We’re excited to provide an update on our on-orbit imagery processing efforts. Between April and July, we performed various hardware and software tests in-orbit, and over the past few months we have been receiving some exciting results from our direct tasking and on-orbit processing pipelines onboard NewSat-27.

Where We Stand

As of November 2022, we have successfully demonstrated the capability for customers to task the satellite with multiple captures, resulting in over 100 images from NewSat-27’s multispectral camera.

We had our most recent live image capture and onboard processing test on October 30th over Tartus, Syria. Let’s run through how we handled these images starting from the raw capture in-orbit all the way to results on the ground, utilizing Edge AI in space:

Raw images captured by the satellite consist of a single channel comprising four different ‘bands’ of information, each representing a specific wavelength of light. Palantir Edge AI then orchestrated our onboard imagery pre-processing services to convert batches of raw images into standard, three-channel RGB images. By processing images into the standardized format our models expect, we can improve accuracy and produce more confident results for our users. As part of this specific capture, we received 44 raw images that we processed into six RGB images.
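A toy version of the band-selection step might look like the following; the band ordering, bit depth, and scaling factors are assumptions, and the real onboard pipeline is considerably more involved:

```python
def to_rgb(bands, rgb_indices=(2, 1, 0), max_raw=4095):
    """bands: list of four single-band images (2-D lists of raw sensor
    counts, assumed 12-bit here). Returns one image of (r, g, b) 8-bit
    pixel tuples, picking the red/green/blue bands by index."""
    r, g, b = (bands[i] for i in rgb_indices)
    scale = 255 / max_raw
    return [
        [(round(r[y][x] * scale), round(g[y][x] * scale), round(b[y][x] * scale))
         for x in range(len(r[0]))]
        for y in range(len(r))
    ]

# A 1x2-pixel toy capture with four bands (blue, green, red, near-infrared).
bands = [[[0, 4095]], [[2048, 4095]], [[4095, 4095]], [[1000, 1000]]]
rgb = to_rgb(bands)  # [[(255, 128, 0), (255, 255, 255)]]
```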


After pre-processing was completed, we then ran AI models onboard the satellite. For this particular capture, Edge AI ran our in-house Palantir Omni model to identify buildings in the images. We received 210 building detections, or ‘inferences’, from the model. For each inference, our post-processing services created PNG thumbnails and computed geodetic coordinates by using the satellite telemetry and the onboard global elevation datasets. The outputs were then bundled and secured using various onboard cryptographic mechanisms, so we could validate the data once it was received on the ground.
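The post doesn't specify the cryptographic mechanisms used, but the validate-on-the-ground idea can be sketched with a simple HMAC over the serialized detections; the key handling here is purely illustrative:

```python
import hashlib
import hmac
import json

SHARED_KEY = b"illustrative-key"  # in practice, provisioned securely

def seal(detections):
    """Onboard: serialize detections and attach an integrity tag."""
    payload = json.dumps(detections, sort_keys=True)
    tag = hmac.new(SHARED_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return {"payload": payload, "tag": tag}

def verify(bundle):
    """Ground-side: check the payload was not altered in transit."""
    expected = hmac.new(SHARED_KEY, bundle["payload"].encode(),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, bundle["tag"])

bundle = seal([{"class": "building", "lat": 34.88, "lon": 35.88}])
```

Any modification to the payload between the satellite and the ground station would cause `verify` to return `False`.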

In our initial on-orbit tests, we discovered an edge-case bug in our pre-processing algorithm. To remedy the issue, we uplinked a small software patch to the satellite that modified how we converted these individual images into RGB images. Once our patch was uplinked, we were able to update our software onboard to account for this new case within seven minutes. With the upgrade infrastructure in-place, we can continuously refine and augment our in-orbit software and algorithms.

Notably, in this live capture we were able to demonstrate the software’s capacity to process all 44 frames within seven minutes. In our previous post, we discussed the strict time constraints on each individual processing run of Edge AI. Even accounting for the update, our end-to-end processing time was comfortably within the thresholds we had initially targeted. For even larger captures, our software features a built-in checkpointing system for resuming processing in the event that a run has to be halted.
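A checkpointing loop of the kind described, where a restarted run resumes from the last completed frame, might be sketched as follows (file layout and names are assumptions):

```python
import json
import os
import tempfile

def process_frames(frames, checkpoint_path, process):
    """Process frames in order, persisting the index of the next frame so
    a restarted run resumes where the halted one stopped."""
    start = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            start = json.load(f)["next_frame"]
    results = []
    for i in range(start, len(frames)):
        results.append(process(frames[i]))
        # Record progress after every frame, so a halt loses at most one frame.
        with open(checkpoint_path, "w") as f:
            json.dump({"next_frame": i + 1}, f)
    return results

# First run processes everything; a rerun with the checkpoint present
# starts past the frames already completed.
checkpoint = os.path.join(tempfile.mkdtemp(), "checkpoint.json")
processed = process_frames(["f1", "f2", "f3"], checkpoint, str.upper)
```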

What’s Next?

While this previous version of our Omni model was geared towards identifying buildings of interest and focused on the onboard integration with the satellite, our next generation of in-house models can identify more specialized object classes, such as ships. These models are already running on the ground as we test their performance. We ran this same capture through one of our newer models and were able to identify various ships near the port of Tartus in Syria with high confidence. We will be sending this new model up to the satellite in our next upgrade cycle. This will allow us to demonstrate Edge AI’s ability to continuously update and manage models while in flight, in order to optimize inference results based on areas of interest.

Figure 1: Ships off the coast of Tartus, Syria. Detections come from Palantir’s new in-house ML models on imagery collected as part of our Tartus capture.

We have also integrated our Edge AI outputs with Palantir MetaConstellation. MetaConstellation provides end-to-end software around satellite imaging, including an operational UI for image analysis. It allows users to annotate imagery with features and easily compare multiple images from different vendors and sensors over a given area of interest.

Our outputs from the AIP Satellite — either the combined image with detections, or just the PNG thumbnails — can be viewed directly within MetaConstellation. This means that in future deployments we would be able to directly downlink from an Edge AI-equipped satellite to a tactical instance of MetaConstellation in the field, allowing detections and imagery to be sent to operational users within minutes.


Figure 2: Palantir MetaConstellation makes imagery analysis readily accessible to users. Here, we compare imagery from our Tartus capture on October 30, 2022 with images that we had previously collected on September 17, 2022.

Our Ongoing Commitment

We are continuing to invest in our on-orbit capabilities and are currently focused on hardware-backed security mechanisms, upgraded model capabilities, and our in-house georegistration algorithm, which should dramatically increase the accuracy of our model inferences. We are also planning to introduce new communication options to facilitate direct downlink for data, which will allow Palantir to get inferences into the hands of our customers faster than ever before.

This post contains forward-looking statements within the meaning of Section 27A of the Securities Act of 1933, as amended, and Section 21E of the Securities Exchange Act of 1934, as amended. These statements may relate to, but are not limited to, expectations regarding the expected benefits and uses of our software platforms. Forward-looking statements are inherently subject to risks and uncertainties, some of which cannot be predicted or quantified. Forward-looking statements are based on information available at the time those statements are made and were based on current expectations as well as the beliefs and assumptions of management as of that time with respect to future events. These statements are subject to risks and uncertainties, many of which involve factors or circumstances that are beyond Palantir’s control. These risks and uncertainties include Palantir’s ability to meet the unique needs of its customers; the failure of its platforms and solutions to satisfy its customers or perform as desired; the frequency or severity of any software and implementation errors; its platforms’ reliability; and the ability to modify or terminate the partnership. Additional information regarding these and other risks and uncertainties is included in the filings Palantir makes with the Securities and Exchange Commission from time to time. Except as required by law, Palantir does not undertake any obligation to publicly update or revise any forward-looking statement, whether as a result of new information, future developments, or otherwise.

This post originally appeared on Palantir.com and is re-published with permission.

Download our Resource, “Resilient and Effective Space Capabilities” to learn more about how Palantir Technologies can support your organization.

Enabling Responsible AI in Palantir Foundry

Editor’s Notes: The following is a collaboration between authors from Palantir’s Product Development and Privacy & Civil Liberties (PCL) teams. It outlines how our latest model management capabilities incorporate the principles of responsible artificial intelligence so that Palantir Foundry users can effectively solve their most challenging problems.

At Palantir, we’re proud to build mission-critical software for Artificial Intelligence (AI) and Machine Learning (ML). Foundry — our operating system for the modern organization — provides the infrastructure for users to develop, evaluate, deploy, and maintain AI/ML models to achieve their desired organizational outcomes.

From stabilizing consumer goods supply chains, to optimizing airplane manufacturing processes, and monitoring public health outbreaks across the globe, Foundry’s interoperable and extensible architecture has enabled data science teams worldwide to readily collaborate with their business and operational teams, enabling all stakeholders to create data-driven impact.


As we discussed in a previous data science blog post, using AI/ML for these important use cases demands software that spans the entire model lifecycle. Foundry’s first-class security and data quality tools enable users to develop AI/ML models, and by establishing a trustworthy data foundation, our software offers the connectivity and dynamic feedback loops that these teams need in order to sustain the effective use of models in practice.

Further to this, developing capabilities that facilitate the responsible use of artificial intelligence is an indispensable part of building industry-leading AI/ML capabilities. Here, we’ll share more about what responsible AI means at Palantir, and how Foundry’s latest model management and ModelOps capabilities enable organizations to address their most challenging problems.

Responsible AI at Palantir

At its core, our AI/ML product strategy centers around developing software that enables responsible AI use in both collaborative and operational settings. We believe that the term has many dimensions and includes considerations around AI safety, reliability, explainability, and governance. We’ve publicly advocated for a focused, problem-driven approach as well as the importance of robust data governance to AI/ML in multiple forums.

We believe that the tenets of responsible AI are not just limited to model development and use but have considerations throughout the entire model lifecycle. For example, developing reliable AI/ML solutions requires tools for the management and curation of high-quality data. These considerations extend beyond model deployment alone and include how end-users interact with their AI outputs and how they can use feedback loops for iteration, monitoring, and long-term maintenance.

Incorporating responsible AI principles in our software is also a core part of our commitment to privacy and civil liberties. Building this kind of software means recognizing that AI is not the solution to every problem and that a model for one problem will not always be a solution to others. A model’s intended use should be clearly and transparently scoped to specific business or operational problems.

Moreover, the challenges of using AI for mission-critical problems span a variety of domains and require expertise from a diverse breadth of disciplines. Building AI solutions should therefore be an interdisciplinary process where engineers, domain experts, data scientists, compliance teams, and other relevant stakeholders work together to ensure the solution represents the specialized demands and requirements of the intended field of application. The values of responsible AI shape how we build our software, and in turn, they enable our customers to use AI/ML solutions in Foundry for their most critical problems.

Model Management in Foundry

Building on the platform’s robust security and data governance tools, Foundry’s model management capabilities are designed to encourage users to incorporate responsible AI principles throughout a model’s lifecycle. We have recently released product capabilities that improve the testing and evaluation ecosystem through no-code and low-code interfaces. We encourage you to read more about these here.

Problem-first modeling

In Foundry, orienting around the “operational problem” that models are trying to solve is at the heart of this new model management infrastructure. Foundry offers many tools for a data-first and exploratory approach to model experimentation, but for mission-critical use cases, AI/ML applications need to be scoped to a specific problem. We have deliberately built modeling objectives to focus model development, evaluation, and deployment around well-defined problems.

The Modeling Objectives application enables users to define a problem, develop candidate models as solutions to these challenges, perform large-scale testing and evaluation, deploy models in many modalities to both staging and production applications, and then monitor them to enable faster iteration.

Specifying the modeling problem from the outset enables collaborators to better understand — and test for — the application and context for which the models are intended. This also provides greater insight into inadvertent reuse or repurposing of models. Modeling objectives provide a flexible yet structured framework that presents an opportunity to streamline model development and deployment by collecting key datasets, identifying stakeholders, and creating a testing and evaluation plan before their development begins.
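Foundry’s actual interfaces are proprietary, but the structure an objective captures — a problem statement, stakeholders, key datasets, an evaluation plan, and a record of candidate submissions — can be sketched in plain Python. Everything below (the class, field names, and method) is a hypothetical illustration, not Foundry’s API:

```python
from dataclasses import dataclass, field

@dataclass
class ModelingObjective:
    """Illustrative stand-in for a modeling objective: the problem
    definition is captured before model development begins."""
    problem_statement: str
    stakeholders: list = field(default_factory=list)
    key_datasets: list = field(default_factory=list)
    evaluation_plan: list = field(default_factory=list)  # named checks/metrics
    candidates: list = field(default_factory=list)       # submitted model versions

    def submit_candidate(self, model_name: str, metadata: dict) -> None:
        # Each submission carries metadata that becomes part of the
        # objective's permanent record.
        self.candidates.append({"model": model_name, "metadata": metadata})

objective = ModelingObjective(
    problem_statement="Predict satellite conjunction risk within 72 hours",
    stakeholders=["data science", "operations", "compliance"],
    key_datasets=["observation_feed", "historical_conjunctions"],
    evaluation_plan=["precision@k", "bias review", "latency check"],
)
objective.submit_candidate("risk_model_v1", {"training_data": "observation_feed"})
print(len(objective.candidates))  # → 1
```

The point of the structure is ordering: the problem, stakeholders, and evaluation plan exist before any candidate is submitted, so every submission is judged against criteria fixed up front.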

These objectives also transparently communicate the state of a particular AI/ML solution — from model development to testing, to deployment and further post-deployment actions like monitoring and upgrades. This enables users to be more intentional, responsible, and effective in how they use AI to address their organization’s operational challenges.

Deep integrations for security and governance

Data protection, governance, and security are core components of Palantir Foundry and are especially important for AI/ML. AI solutions must be traceable, auditable, and governable in order to be used effectively and responsibly. To facilitate this, Foundry’s model management infrastructure integrates deeply with the platform’s robust capabilities for versioning, branching, lineage, and access control.

Users can submit a model version to an objective and propose that model as a candidate solution for the problem defined in that objective. When submitting a model, users are encouraged to fill out metadata about the submission which becomes part of its permanent record. Project stakeholders and collaborators can use this to better understand the details of each submission and create a system of record that catalogs all future models for a particular modeling problem. With Data Lineage, they can also quickly see the provenance of every model that is submitted to an objective, revealing not only the models themselves, but also their training and testing data and what sources those datasets originally came from.

Foundry’s model management infrastructure natively integrates with the platform’s security primitives for access controls. This enables multiple model developers, evaluators, and other stakeholders to work together on the same modeling problem, while maintaining strict security and governance controls.

Robust testing and evaluation capabilities

Testing and evaluation (T&E) is one of the most critical steps in any model’s lifecycle. During T&E, subject matter experts, data scientists, and other business stakeholders determine whether a model is both effective and efficient for any given modeling problem. For example, models may need to be evaluated quantitatively and qualitatively, assessed for bias and fairness concerns, and checked against organizational requirements before they can be deployed to applications in production environments. That’s why we have released a new suite of capabilities to facilitate more effective and thorough T&E in Foundry.

Foundry now offers evaluation libraries for common AI/ML problems as a part of the Modeling Objectives application. The availability and native integration of these libraries within Foundry’s model management infrastructure enable users to quickly produce well-known, quantitative metrics in a point-and-click fashion for common modeling problems, all without having to dive into any technical implementation.

We’ve also included a framework for users to write their own custom evaluation libraries. Libraries authored in this framework benefit from the same UI-driven workflow and integration with modeling objectives. This extends the power of the integrated evaluation framework to more advanced modeling problems or context-specific use cases.

Building on the evaluation library integrations, we’ve also added the ability to easily evaluate models across subsets of data. This lets users quickly and exhaustively compute metrics to identify areas of model weakness that might otherwise go undetected if only computing aggregate metrics. Evaluating models on subsets can more easily surface bias or fairness concerns that affect only a portion of the model’s expected data distribution. Users can also configure their T&E workflows to run automatically on all candidate models proposed for a problem in order to build a T&E procedure that is both systematic and consistent.
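The value of subset evaluation is easy to see in a minimal, self-contained sketch (this is the general technique, not Foundry’s evaluation library): an aggregate metric can look acceptable while one slice of the data fails entirely.

```python
from collections import defaultdict

def accuracy_by_subset(records, subset_key):
    """Compute accuracy overall and per subset, so weaknesses that affect
    only a slice of the data are not hidden by the aggregate metric.
    `records` are dicts with 'label', 'prediction', and the subset field."""
    totals, correct = defaultdict(int), defaultdict(int)
    for r in records:
        group = r[subset_key]
        totals[group] += 1
        correct[group] += int(r["label"] == r["prediction"])
    per_subset = {g: correct[g] / totals[g] for g in totals}
    overall = sum(correct.values()) / sum(totals.values())
    return overall, per_subset

records = [
    {"region": "A", "label": 1, "prediction": 1},
    {"region": "A", "label": 0, "prediction": 0},
    {"region": "B", "label": 1, "prediction": 0},
    {"region": "B", "label": 0, "prediction": 1},
]
overall, per_subset = accuracy_by_subset(records, "region")
print(overall)     # 0.5 — the aggregate looks mediocre but plausible...
print(per_subset)  # {'A': 1.0, 'B': 0.0} — ...while region B fails entirely
```

Running the same computation automatically over every candidate model proposed for a problem is what makes the resulting T&E procedure systematic and consistent rather than ad hoc.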

We also recognize that not all T&E procedures are quantitative. Therefore, checks in modeling objectives help keep track of certain pre-release tasks that might need to get done as part of the T&E process before a model can be released.

Looking ahead

Modeling objectives and the T&E suite are just some of the latest capabilities to encourage responsible AI in Foundry, and we continue to invest in new capabilities for effective model management. From the tools that facilitate robust model evaluation across domains, to mechanisms for seamless model release and rollback in production settings, our model management offering will always focus on empowering our customers to use their AI/ML solutions effectively, easily, and responsibly for their organization’s most challenging problems.

This post originally appeared on Palantir.com and is re-published with permission.

Download our Resource, “Palantir Named a Leader in AI/ML Platforms” to learn more about how Palantir Technologies can support your organization.

Safely Modernize Legacy Systems with Palantir Foundry Container Engine (FCE)

Missile warnings. Airplane flight statuses. Satellite observation alerts. Much of the U.S. Government’s most critical digital infrastructure is dependent on software built during the Cold War, written in archaic languages (e.g., Fortran, COBOL, Ada), and/or installed exclusively on mainframe computers. While the infrastructure is old and may struggle to keep up with the needs of today, the core logic often works well. Yet re-writing decades of work and millions of lines of code just isn’t a feasible path to modernization.

Introducing Foundry Container Engine (FCE): FCE runs containerized legacy code in Foundry, enabling government agencies to leverage what’s working and safely leave behind what isn’t. For an analogy, consider AWS Lambda, which revolutionized how engineers run code by abstracting away the hardware infrastructure required — no more worrying about servers and clusters. In a similar fashion, FCE is revolutionizing how engineers integrate and orchestrate legacy investments in their modern software architectures. FCE streamlines your modernization journey, allowing you to incrementally rebuild millions of lines of legacy code while continuously delivering new value to the organization.

The Challenge: Operationalizing legacy code is hard

Code that is decades old is not inherently bad. On the contrary, it’s been battle-tested over decades and written by people with deep expertise in highly specialized fields. Yet aligning old software to the changing operational realities of today is both daunting and necessary. Re-writing and scaling up a satellite model that was built in the 1980s to detect 100 satellites, so that it can detect 30,000 satellites in 2023, is often untenable work.

Our customers who rely on legacy code and infrastructure frequently face the following challenges:

  • Modernizing is disruptive: Too often, the only options presented for modernization are highly disruptive — data lakes, code base overhauls, and multi-year roadmaps. These run the risk of taking critical systems offline without ever accomplishing the necessary operational outcomes.
  • Unscalable: Up-sizing environments to meet computation requirements involves long lead times, and scaling out additional instances of a product is often impossible. Doing this in real time to meet today’s critical deadlines is out of the question.
  • Siloed logic: Sophisticated legacy models are much more valuable when integrated with other data sources. In our satellite example, this might include observation data, surveillance networks, sensors, and more. Adding new data feeds, data processes, outputs, and interfaces is unfeasible or too slow to be valuable.
  • Closed ecosystem: The data pipelines associated with legacy code are often a black box. There is no way for other platform and development teams to securely collaborate, effectively limiting upside by restricting the number of people able to interact with the code and provide novel analyses.
  • Divorced from operational decisions: Legacy models produce compelling insights, but the outputs are not actionable. There is no easy way to automatically create an intuitive visualization or useful alerting logic. A collision model might show satellites are about to collide, but users cannot action this information to re-orient where those satellites are flying.

Solution: Use FCE to lift and shift logic to Foundry and make it 100x more valuable

As a centrally-managed, cloud-based SaaS platform, Foundry offers instant access to cutting-edge modern software capabilities, including streaming pipelines, live API-driven inference, and autoscaling. Now that FCE allows containerized code to run in Foundry, unlocking the full value of legacy systems has never been easier.

Day 1 benefits include:

  • Safely and incrementally modernize: Immediately start your modernization journey with the assurance that critical systems will continue to function, and deprecation of old components will do no harm.
  • Rapidly scalable infrastructure: Achieve on-demand expansion of your compute and storage environment as capabilities evolve and expand. This provides resiliency and redundancy to avoid a single point of failure. Replacing one file with another in the FAA’s flight software should not cause flights to be grounded nationwide.
  • Flexibility and interoperability: Seamless addition of future data feeds, data processes, objects, schemas, and interfaces. Fuse disparate data to quickly produce new analyses.
  • Secure collaboration: Built-in security access control features enable secure collaboration among combined platform and development teams. When combined with pipeline transparency and DevSecOps iteration, customers can securely democratize outputs over open, extensible APIs.
  • Modern and dynamic user interfaces for rapid and automated decision-making: Users easily configure alerting logic and produce new applications with low-code/no-code tooling. Translate the complex output of a satellite physics model into an operationally relevant Space Domain Awareness application.

In a silo, legacy software can still be improved, but those small gains come at the expense of the significant, compounding benefits of modernization. FCE enables agencies to accelerate progress toward their software-driven outcomes by integrating anything run by FCE with other Foundry products (e.g., Pipeline Builder, streaming, Workshop). With Foundry’s core principles of modularity and interoperability, agencies can selectively deprecate legacy software components without disrupting their data sources, ontology, and actions. In a world where the missiles are parabolic one month and hypersonic the next, innovation in bits must outpace innovation in atoms.

This post originally appeared on Palantir.com and is re-published with permission.

Download our Resource, “Impact Study: Accelerating Interoperability with Palantir Foundry” to learn more about how Palantir Technologies can support your organization.