
Leveraging Cloud Platforms to Enhance Cybersecurity and Data Management


Cyberattacks have become a common occurrence for organizations around the world, affecting everyone from small nonprofits to entire cities. The U.S. government is no stranger to such attacks: two years ago, one state agency estimated that it was receiving approximately 150 million attacks per day, and that number is estimated to have doubled since then. The average cost per security incident for a public sector organization is $2.3 million, and that figure doesn’t include the ramifications for national security, citizen services, or the national economy.

Fortunately, many government agencies are responding to the rising threat by expanding their IT and security services to audit, log, and detect attempted cyberattacks on devices and systems. By gathering logs, organizations can query the data and get a better picture of a potential attack.

However, this practice is presenting agencies with a new challenge: finding a cost-effective and easily searchable storage method for the ever-increasing amounts of telemetry data. Legacy security information and event management (SIEM) software was not designed to ingest, process, and store the large amount of data pouring in. One solution exists in the cloud: a serverless, cost-effective, and highly scalable data processing engine with built-in analysis functions.

Growing Threats, Growing Data

Cyberattacks are easier than ever to carry out—anyone with a computer or mobile device and an internet connection can become a cyber threat. In-depth coding or hacking knowledge is no longer a barrier to entry as there are countless tools available with step-by-step guides to launching an attack, and even downloadable automated hacking tools.

Advanced adversaries can accomplish lateral movement in a victim’s environment shortly after the initial compromise of the network. This rapid infiltration makes breaches harder for affected agencies to detect: according to last year’s Verizon Data Breach Investigations Report, in 2019 it took public sector organizations months, if not years, on average to identify an attack.

The United States Computer Emergency Readiness Team (US-CERT) has issued detection and mitigation guidance to government organizations that stresses the importance of logging all system actions and keeping that data for at least two years. Doing so can help mitigate the impacts of cyberattacks and enhance the organization’s overall network security posture.

As organizations attempt to follow US-CERT guidance, they are running into the inevitable problem of data storage. The size of the data logs themselves is doubling every two years, and that rate is expected to hold or increase. It’s not unusual for large organizations to ingest hundreds of thousands of events per second, which equates to terabytes of volume per day.
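To make that scale concrete, the arithmetic can be sketched as follows. The event rate and average event size used here are illustrative assumptions, not figures from US-CERT or any specific agency; the two-year retention window is the minimum from the guidance described above.

```python
SECONDS_PER_DAY = 86_400
BYTES_PER_TERABYTE = 10**12  # decimal terabytes


def daily_volume_tb(events_per_second, avg_event_bytes):
    """Return the raw log volume generated per day, in terabytes."""
    return events_per_second * avg_event_bytes * SECONDS_PER_DAY / BYTES_PER_TERABYTE


def retained_volume_tb(events_per_second, avg_event_bytes, retention_days):
    """Return the raw volume held when every event is kept for retention_days."""
    return daily_volume_tb(events_per_second, avg_event_bytes) * retention_days


if __name__ == "__main__":
    # Assumed workload: 200,000 events per second at 500 bytes per event.
    per_day = daily_volume_tb(200_000, 500)
    two_years = retained_volume_tb(200_000, 500, retention_days=730)
    print(f"{per_day:.2f} TB per day")
    print(f"{two_years / 1000:.2f} PB over two years")
```

Under those assumptions the result is about 8.64 TB per day, or roughly 6.3 PB across two years, before any compression or replication is taken into account.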

The disparate state of system data also creates challenges in logging. Many organizations don’t have stringent logging practices in place, and collected data sometimes goes unused or unsynchronized. Collecting siloed data, or data from networks and applications at different classification levels, creates additional obstacles. The growth of Internet of Things infrastructure is producing additional data that needs to be logged. Most of this data is unstructured, meaning that even once it’s appropriately logged, it is difficult to query.
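One common way to make unstructured log text queryable is to parse each line into named fields before storing it. The sketch below assumes a hypothetical line layout (timestamp, host, application name, then a free-text message); real sources vary widely, so the pattern and field names are illustrative only.

```python
import re

# Hypothetical layout: "<timestamp> <host> <app>: <message>"
LINE_PATTERN = re.compile(
    r"^(?P<timestamp>\S+)\s+(?P<host>\S+)\s+(?P<app>[\w\-]+):\s+(?P<message>.*)$"
)


def parse_line(line):
    """Turn one raw log line into a dict of named fields.

    Lines that do not fit the assumed layout are kept, flagged as unparsed,
    so that no data is silently dropped.
    """
    text = line.strip()
    match = LINE_PATTERN.match(text)
    if match is None:
        return {"raw": text, "parsed": False}
    record = match.groupdict()
    record["parsed"] = True
    return record


if __name__ == "__main__":
    sample = "2020-03-01T12:00:00Z fw01 sshd: Failed password for root from 10.0.0.5"
    print(parse_line(sample))
```

Once every line carries fields such as host and app, an analyst can filter and group on those fields instead of searching free text.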

Scalable Storage Solutions

SIEMs, which have been the go-to software approach for reporting and responding to IT security issues, were not designed for the long-term storage and retrieval of the massive amounts of data being logged by organizations today. Using SIEMs for this type of data management is likely to be costly—many SIEMs charge licensing fees based on the amount of data ingested, which forces organizations to either pay for more storage or reduce the amount of data they store. And even if a SIEM can store all the logged data, there’s still the question of scalability and whether the software can handle the velocity of incoming data.

One solution to storing and analyzing high volumes of log data is a cloud-based, serverless data warehouse. Such data warehouses are scalable, cost-effective, and often fully managed, so the organization does not have to handle the underlying infrastructure. These data processing engines can query large volumes of data at high speed and analyze security events as they unfold by automatically ingesting log data and making it immediately available in the system. The cloud services often include machine learning capabilities as well, allowing analysts to train machine learning models within the service without writing extensive custom code.
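The kind of question an analyst might ask of warehoused log data can be illustrated with a small SQL query. The sketch below uses Python's built-in sqlite3 module purely as a local stand-in for a cloud data warehouse; the table name, columns, and sample rows are hypothetical, and a managed service would run a comparable query over far larger volumes.

```python
import sqlite3


def build_log_table(events):
    """Create an in-memory table of authentication events and load the rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE auth_events ("
        "event_time TEXT, source_ip TEXT, username TEXT, outcome TEXT)"
    )
    conn.executemany("INSERT INTO auth_events VALUES (?, ?, ?, ?)", events)
    conn.commit()
    return conn


def failed_logins_by_source(conn, min_failures):
    """Return (source_ip, failures) for sources with at least min_failures failures."""
    query = """
        SELECT source_ip, COUNT(*) AS failures
        FROM auth_events
        WHERE outcome = 'failure'
        GROUP BY source_ip
        HAVING COUNT(*) >= ?
        ORDER BY failures DESC, source_ip ASC
    """
    return conn.execute(query, (min_failures,)).fetchall()


# Hypothetical sample data for illustration only.
SAMPLE_EVENTS = [
    ("2020-03-01T12:00:00Z", "10.0.0.5", "root", "failure"),
    ("2020-03-01T12:00:02Z", "10.0.0.5", "root", "failure"),
    ("2020-03-01T12:00:04Z", "10.0.0.5", "admin", "failure"),
    ("2020-03-01T12:01:00Z", "10.0.0.9", "jdoe", "failure"),
    ("2020-03-01T12:02:00Z", "10.0.0.7", "jdoe", "success"),
]

if __name__ == "__main__":
    connection = build_log_table(SAMPLE_EVENTS)
    for source_ip, failures in failed_logins_by_source(connection, min_failures=2):
        print(source_ip, failures)
```

Because the query is ordinary SQL over structured rows, the same logic carries over to a managed warehouse with little change beyond the connection and dialect details.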

US-CERT has made it clear that logging and analyzing broad swaths of operational data, and storing it for years, is one of the best ways to identify and track down breaches. Federal agencies are attempting to adhere to the guidelines, but legacy data processing systems lack the capacity and capability to cost-effectively collect, store, and query the data. By using specialized cloud computing services for data logging, along with indefinite storage and near-instant querying of that data, agencies can fully realize US-CERT’s mitigation strategy.

Citywide Cloud-Based Cybersecurity

New York stepped up its citywide cybersecurity posture when Mayor Bill de Blasio established the New York City Cyber Command in 2017. The organization works with city agencies to design and protect operational systems with cyber threats in mind, and uses cloud infrastructure to detect and mitigate threats.

NYC Cyber Command uses Google Cloud Platform services to collect and analyze terabytes of data each day and identify potential breaches or security concerns. Cloud Pub/Sub ingests data from participating agencies’ systems, and Cloud Dataflow formats the data to make it ready for analysis. BigQuery, a serverless, managed data warehouse, serves as the key analytical engine to quickly provide a deep and broad view of the city’s cybersecurity incidents. The organization also follows a zero-trust access model that uses cloud infrastructure for access management and a proxy tool to allow secure access to the applications without a VPN.
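The ingest, format, and analyze flow described above can be sketched as three chained stages. This is a simplified, in-memory illustration and not NYC Cyber Command's implementation: it does not call any Google Cloud client library, and the message fields, agency names, and function names are all hypothetical.

```python
import json


def ingest(raw_messages):
    """Stand-in for the ingestion stage: yield each message payload as it arrives."""
    for payload in raw_messages:
        yield payload


def format_events(payloads):
    """Stand-in for the formatting stage: decode JSON and normalize field names."""
    for payload in payloads:
        try:
            event = json.loads(payload.decode("utf-8"))
        except (UnicodeDecodeError, json.JSONDecodeError):
            continue  # a production pipeline would set bad records aside for review
        if not isinstance(event, dict):
            continue
        yield {
            "agency": str(event.get("agency", "unknown")).lower(),
            "event_type": str(event.get("type", "unknown")).lower(),
            "source_ip": event.get("src") or event.get("source_ip"),
        }


def load(events, table):
    """Stand-in for the warehouse load: append rows to a list and return the count."""
    count = 0
    for row in events:
        table.append(row)
        count += 1
    return count


def run_pipeline(raw_messages):
    """Chain the three stages and return the resulting table of rows."""
    table = []
    load(format_events(ingest(raw_messages)), table)
    return table


if __name__ == "__main__":
    messages = [
        b'{"agency": "DOT", "type": "LOGIN_FAILURE", "src": "10.0.0.5"}',
        b"not json",
        b'{"agency": "Parks", "type": "Port_Scan", "source_ip": "10.0.0.9"}',
    ]
    for row in run_pipeline(messages):
        print(row)
```

Keeping the stages separate mirrors the division of labor in the text: one service receives data from many agencies, another puts it into a consistent shape, and the warehouse only ever sees rows that are ready to query.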

The combination of cloud-based tools is a scalable solution that will allow all New York City agencies to participate in the program within the next year, creating a vast network of data that will provide valuable insight into the city’s digital operations. The infrastructure will grow with the program, protecting the city’s critical infrastructure and providing valuable insights to strengthen New York’s digital security posture for years to come.

To learn more about enhancing cybersecurity and data management at your organization with cloud-based platforms, join our weekly webinar series by Carahsoft and Google Cloud.
