Entitlement to Data

J2EE Journal, 2008

Requirements for user-facing applications frequently say something like: “The user has to see/read/be shown only the funds/records/itineraries/policies he or she is entitled to.” Permissions in these cases usually depend on multiple factors related to the user profile (job role, locale, etc.), to the protected data (data origin, storage, approval status, etc.), or to both. These are fine-grained entitlement requirements, which commercial systems rarely support.

In this article I will discuss different methods of entitlement to persistent data. The described example allows Java developers to construct controlled access to data using known Java tools. The example may be viewed as an extension to existing entitlement systems or as a lightweight solution for entitlement to data.

Business Task
Business processes assume that people may read, modify, publish, and massage data whenever a business task requires it. This sounds simple, but it is not a simple technical task at all. Data access requests usually come from the business without any consideration for the data models implemented in the data storage, for data normalization, or for data distribution. For instance, database schemas at the enterprise level reflect common needs, while data access control must support the specific requirements of every business unit.

Sometimes a lot of factors and rules have to be applied to find what data the user is entitled to. Plus, different user activities may require different entitlement rules. The complexity of the rules varies a lot. An example from the financial industry combines a job assignment rule for an accountant with the so-called “Asia Data Access” rule, which may have several exceptions. The combined rule set has to be applied to every financial transaction to find out which transaction the accountant may work with.

If we use a Rule Engine for this task, we simplify the rule set management dramatically. However, each of the millions of transactions per day has to be evaluated by the Rule Engine. The efficiency of this approach is doubtful: processing may require a lot of computational resources, may take a lot of time, and may not scale. And that is before counting the resources that have to be spent on controlling rule consistency and compliance.

If the Rule Engine is not used and we code rules directly, we still have to deal with each individual transaction and with the individual user’s profile to calculate entitlement. The number of possible combinations of the financial and user’s data becomes enormous and not really manageable.

Design Considerations
An entitlement system may be built on different models, such as user- or resource-centric, and role- or rule-based. Regardless of the model, every entitlement system operates with users and resources/assets. The purpose of an entitlement system is to allow users to access only the assets they are permitted to access or, put the other way, to protect assets from unauthorized access by users.

When the asset is a persisted data record, we face a challenge: the number of such assets may easily exceed billions and grow by millions every day, as mentioned earlier for financial transactions. Treating each transaction as an asset and defining rules to access it is clearly unrealistic. The case gets even worse when we cannot identify every user of the data a priori and grant him or her access rights based on identity. This usually happens when the user’s access rights are defined by rules based on user characteristics such as business role and locale.

From a security perspective, proper data access control requires the data to be selected before it reaches the application that is available to the user. Several methods of selection may be devised, but all of them fall into two categories: inside the data storage/database or outside of it. For instance, selection at the object level is inefficient, potentially poor in performance, and insecure, because the full data set, including data the user is not supposed to access, must be retrieved from the database into the application, e.g., as Value Objects, and only then filtered. The Value Objects become vulnerable and easier for intruders to access. If we choose to perform the selection inside the database via predefined data filters, we significantly improve performance but take on considerable risk, because the decision of “who may access what” happens after the request “gets into” the database. If the request/query is altered on its way, we open the database to malicious operations or to malicious data access.

To minimize the security risk and provide acceptable performance, an entitlement system has to control data queries; that is, only authorized queries are executed inside the data storage. The job of the entitlement system in this case is to maintain the association between users and the queries they are entitled to use. Moreover, if the user’s and the asset’s characteristics can be used as query parameters, a relatively small number of controlled selectors/filters is enough to select a large amount of data individually for each user.
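The association between users and the filters they may use can be sketched as a small in-memory registry. This is only an illustration under stated assumptions: the class EntitlementRegistry, its methods, and the filter name used in the usage note are hypothetical and do not come from any entitlement product.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical registry: maps a user ID to the names of the data filters
// (predefined, parameterized query fragments) that the user may use.
class EntitlementRegistry {
    private final Map<String, Set<String>> filtersByUser = new HashMap<>();

    // Records that the user is entitled to the named filter.
    void grant(String userId, String filterName) {
        filtersByUser.computeIfAbsent(userId, k -> new LinkedHashSet<>()).add(filterName);
    }

    // Returns the filter names the user is entitled to; empty if none.
    Set<String> filtersFor(String userId) {
        Set<String> names = filtersByUser.get(userId);
        if (names == null) {
            return Collections.<String>emptySet();
        }
        return Collections.unmodifiableSet(names);
    }

    // True only if the user has been granted the named filter.
    boolean isEntitled(String userId, String filterName) {
        return filtersFor(userId).contains(filterName);
    }
}
```

After grant("alice", "fundsByRegion"), isEntitled("alice", "fundsByRegion") is true, while any user without a grant gets an empty set and no filter at all. The registry controls a handful of filter names rather than individual records, which is the point of the approach.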

The most secure solution would be to intercept a request for data on its way to the database and dynamically compose or modify the query based on the user’s entitlement to data (see Figure 1). This solution is equally effective whether the data is in a single storage or spread over multiple heterogeneous sources. For a single storage, we can use a regular object-relational mapping tool like Hibernate. For a distributed data source, we can use something like IBM WebSphere DataStage with a custom adapter and a defined data filter, or BEA LiquidData with XQuery. However, for simplicity, we will discuss the single data storage case in the following section.

Example of Data Entitlement Implementation
Let’s consider a Web application that is supposed to display a portfolio of personal investments. The application gets data from multiple database schemas of a single Oracle database via a Data Access Layer (DAL). The DAL is based on an object-relational mapping product such as Hibernate 3.0. The content of the individual portfolio is protected by an entitlement system (ES). That is, every user of the application is registered with the ES, has his or her user profile, and may be entitled to a particular set of funds. Figure 2 shows the relationship and interactions between the application, the ES, and the DAL.

In particular, an end user invokes a Web application (or a business application) via a remote client like a Web browser and, upon authentication, requests the fund information from the portfolio. The application processes the request and prepares the response. The fund data needed for the response is obtained from the database that’s accessible via the DAL. The latter invokes the ES to authorize the request.

In the ES, user entitlement to the funds is interpreted as an association between the user and a set of data filters. The data filter is a meta-definition of a real SQL filter, i.e., part of the “WHERE” clause. The data filter comprises the filter name and the ordered list of the filter parameters. To fill in filter parameters, the ES obtains the user profile and matches the user profile’s attributes with the data filter parameters as shown in Figure 3. An example of the Filter class is shown in Listing 1.

In Listing 1, while the filterName variable is self-explanatory, the java.util.HashMap is constructed with filter parameter names as the keys and the parameter values organized in java.util.ArrayList. This allows the use of multi-valued filter parameters.
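For readers without the listing at hand, a class matching this description might look like the following. It is a sketch and not the article's Listing 1: the class name DataFilter and its method names are assumptions, and LinkedHashMap (a HashMap subclass) is used on the assumption that keeping parameters in insertion order is what the "ordered list" of parameters calls for.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical data filter meta-definition: a filter name plus named
// parameters, each of which may hold several values.
class DataFilter {
    private final String filterName;
    // Parameter name -> list of values; insertion order is preserved.
    private final Map<String, List<Object>> parameters = new LinkedHashMap<>();

    DataFilter(String filterName) {
        this.filterName = filterName;
    }

    String getFilterName() {
        return filterName;
    }

    // Appends a value to the named parameter, creating the list on first use.
    void addParameterValue(String name, Object value) {
        parameters.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
    }

    // Returns a copy of the parameter's values, or an empty list if unset.
    List<Object> getParameterValues(String name) {
        List<Object> values = parameters.get(name);
        if (values == null) {
            return new ArrayList<Object>();
        }
        return new ArrayList<Object>(values);
    }

    // Read-only view of all parameters, in the order they were first added.
    Map<String, List<Object>> getParameters() {
        return Collections.unmodifiableMap(parameters);
    }
}
```

Calling addParameterValue("region", "ASIA") and then addParameterValue("region", "EMEA") yields one parameter with two values, which is the multi-valued case the text mentions.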

The ES returns a list of data filters the user is entitled to. At this moment, the DAL does not know which data filter to apply. The DAL iterates through the received data filter collection and, using the Hibernate API, enables matching filters defined in the DAL. For every enabled filter, the DAL sets filter parameters using values obtained by the ES from the user profile. The fragment of related code is shown in Listing 2. If no exceptions are thrown, the DAL runs the HQL query using the session.createQuery(…).set[ObjectKeyType].list() or session.get(…) API that, in turn, generates a SQL query, executes it against the database, and returns filtered results.
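The iteration described above can be sketched as follows; this is not the article's Listing 2. To keep the example self-contained, the interfaces FilterSession and EnabledFilter are hypothetical stand-ins for the parts of Hibernate's Session and Filter used here (enableFilter, setParameter, setParameterList), and RecordingSession is a test double that only records calls. A real DAL would presumably call the Hibernate objects directly.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Stand-in for the part of org.hibernate.Filter used here (assumption).
interface EnabledFilter {
    EnabledFilter setParameter(String name, Object value);
    EnabledFilter setParameterList(String name, Collection<?> values);
}

// Stand-in for org.hibernate.Session#enableFilter(String) (assumption).
interface FilterSession {
    EnabledFilter enableFilter(String filterName);
}

class EntitledFilterApplier {
    // Enables every entitled filter that the DAL itself defines and binds its
    // parameters. Filters unknown to the DAL are skipped, never enabled.
    // entitledFilters: filter name -> (parameter name -> values from the ES).
    // Returns the names of the filters that were enabled.
    static List<String> apply(FilterSession session,
                              Set<String> filtersDefinedInDal,
                              Map<String, Map<String, List<Object>>> entitledFilters) {
        List<String> enabled = new ArrayList<>();
        for (Map.Entry<String, Map<String, List<Object>>> e : entitledFilters.entrySet()) {
            if (!filtersDefinedInDal.contains(e.getKey())) {
                continue;
            }
            EnabledFilter filter = session.enableFilter(e.getKey());
            for (Map.Entry<String, List<Object>> p : e.getValue().entrySet()) {
                List<Object> values = p.getValue();
                if (values.isEmpty()) {
                    // Fail loudly rather than run a filter with an unset parameter.
                    throw new IllegalArgumentException("No value for " + p.getKey());
                } else if (values.size() == 1) {
                    filter.setParameter(p.getKey(), values.get(0));
                } else {
                    filter.setParameterList(p.getKey(), values);
                }
            }
            enabled.add(e.getKey());
        }
        return enabled;
    }
}

// Test double that records the calls a real session would receive.
class RecordingSession implements FilterSession {
    final List<String> calls = new ArrayList<>();

    public EnabledFilter enableFilter(final String filterName) {
        calls.add("enable " + filterName);
        return new EnabledFilter() {
            public EnabledFilter setParameter(String name, Object value) {
                calls.add(filterName + "." + name + "=" + value);
                return this;
            }
            public EnabledFilter setParameterList(String name, Collection<?> values) {
                calls.add(filterName + "." + name + " in " + values);
                return this;
            }
        };
    }
}
```

The applier acts only on filter names that appear both in the ES response and in the DAL's own definitions, so a filter name the DAL does not know cannot introduce a new query.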

As we see, the DAL is not just a Hibernate implementation. The ability to control data access and even to load-balance and engage parallel processing for complex requests distinguishes the DAL from an O/R mapping tool.

Concern about the performance of the DAL is fair. However, if you really need to secure data access, the performance degradation caused by the entitlement controls should be compensated for by the application architecture. This might include additional data caching and buffering, optimization of the database schema, and the use of denormalization in views. This is why it’s very expensive to add security later on in the application life cycle instead of considering it at the architecture and design phases.

Thus, data is selected inside the database, where selection performs best, and only through the filters the user is entitled to. Of course, you can construct a different implementation of the data filter. For instance, a JDBC call may be intercepted and the SQL query may be enriched with entitlement information on the fly, or the solution may be based on stored procedures instead of dynamic queries generated by Hibernate. Nevertheless, this should not change the security model: for example, the stored procedure name and its input parameters would still be provided by the security system based on user entitlements.
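The JDBC variant, enriching a query on the fly, can be illustrated with a small helper that appends an entitlement restriction using bind placeholders. This is a sketch under stated assumptions: the classes EnrichedQuery and QueryEnricher are hypothetical, the base query is assumed to have no WHERE clause yet, and the table and column names in the usage note are made up.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical result of enriching a base query: SQL text with "?" placeholders
// plus the values to bind, in placeholder order.
class EnrichedQuery {
    final String sql;
    final List<Object> bindValues;

    EnrichedQuery(String sql, List<Object> bindValues) {
        this.sql = sql;
        this.bindValues = Collections.unmodifiableList(bindValues);
    }
}

class QueryEnricher {
    // Appends "column IN (?, ?, ...)" for the entitled values. Only the column
    // name, assumed to come from trusted DAL configuration, is concatenated;
    // user-derived values are returned as bind values, never as SQL text.
    static EnrichedQuery restrict(String baseSql, String column, List<?> entitledValues) {
        if (entitledValues.isEmpty()) {
            // Fail closed: no entitlement means the query matches no rows.
            return new EnrichedQuery(baseSql + " WHERE 1 = 0", new ArrayList<Object>());
        }
        StringBuilder sql = new StringBuilder(baseSql);
        sql.append(" WHERE ").append(column).append(" IN (");
        for (int i = 0; i < entitledValues.size(); i++) {
            sql.append(i == 0 ? "?" : ", ?");
        }
        sql.append(")");
        return new EnrichedQuery(sql.toString(), new ArrayList<Object>(entitledValues));
    }
}
```

For a base query "SELECT * FROM funds" restricted on column "region" with the values ASIA and EMEA, the helper returns "SELECT * FROM funds WHERE region IN (?, ?)" together with the two bind values. An interceptor would then typically bind each value with java.sql.PreparedStatement.setObject before executing the statement.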

Conclusion
I have demonstrated how to control access to millions of data records in persistent data storage via entitlement to a few data filters. The data access layer can combine parameterized data filters with user profile information and construct individualized, secured queries that are executed inside the data storage with maximum performance. This is an example of how an entitlement-to-data solution may be implemented based on user profile information and an O/R mapping tool.

References

Hibernate: Relational Persistence For Idiomatic Java: www.hibernate.org/
IBM WebSphere DataStage: www-306.ibm.com/software/data/integration/datastage/
Poulin, M. “How to Deal with Security When Building Application Architecture.” JDJ, Vol.10, issue 9
BEA AquaLogic Enterprise Security: www.bea.com/framework.jsp?CNT=index.htm&FP=/content/products/aqualogic/security/
BEA Liquid Data for WebLogic 8.1 Documentation: http://e-docs.bea.com/liquiddata/docs81/index.html

Example of the “Asia Data Access” Rule

The Law prohibits disclosure of the local client’s data outside of the country, except for the following circumstances:

Customer-written consent obtained
Internal audit
Performance of risk management estimation:
-Users performing risk management functions with a “need-to-know”
Operational functions outsourced
Proposed restructure, transfer or sale of credit facility
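
If such a rule were coded directly rather than managed in a Rule Engine, its shape might resemble the sketch below. This is one possible reading of the rule, not a legal or authoritative encoding: the enum constants, the class name, the country codes, and the assumption that "outside of the country" can be decided by comparing the client's country with the user's country are all illustrative.

```java
import java.util.Set;

// The exceptions listed in the rule, as an enum (constant names are illustrative).
enum DisclosureException {
    CUSTOMER_WRITTEN_CONSENT,
    INTERNAL_AUDIT,
    RISK_MANAGEMENT_NEED_TO_KNOW,
    OUTSOURCED_OPERATIONS,
    CREDIT_FACILITY_RESTRUCTURE_TRANSFER_OR_SALE
}

class AsiaDataAccessRule {
    // Access from inside the client's country is not restricted by this rule.
    // Access from outside requires at least one applicable exception.
    static boolean mayAccess(String clientCountry, String userCountry,
                             Set<DisclosureException> applicableExceptions) {
        if (clientCountry.equals(userCountry)) {
            return true;
        }
        return !applicableExceptions.isEmpty();
    }
}
```

Evaluating a predicate like this against every record is the per-transaction cost the article argues against; in the filter-based approach, the same decision would instead determine which data filters and parameter values the user receives.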
