Last updated at Mon, 06 Nov 2017 21:13:18 GMT
Overview / Motivation
A key part of moving product creation from custom craftsmanship to a repeatable engineering process is building a modular system whose components are decoupled and easily adaptable to inevitable change. In this blog, I will draw on my recent experience of architecting a DAO layer as part of a software system, and on how I tried to keep the development of that system a healthy and sustainable environment that avoids common pitfalls in the long run. This article is less an overview of the software development process and more a look at the design and architectural considerations around managing a system whose components have grown big and complex. It covers dealing with increasing load while retaining the reliability, performance, resilience and similar properties that come with the increased responsibility placed on any growing organization.
Introduction
Every engineering product, be it a new airplane, a bridge or a software system, has certain properties that must be considered as part of system design and kept current throughout its entire lifecycle: for example maintainability, stability, robustness/elasticity, performance or scalability, to name a few.
Some of these must be defined in the inception phase of the project and remain an intrinsic part of the system as key requirements throughout every phase of its lifecycle: planning, analysis, design, development, deployment and maintenance/retirement.
This article focuses on the design and architecture aspects of a DAO framework. A DAO subsystem is just one of several subcomponents in a software system, but a key one: it enables the design and implementation of higher level application components while hiding the implementation of the underlying data persistence, e.g. a (NO)SQL database, file system etc. A carefully thought out architecture helps in responding to the requirements of the development process itself and in meeting the lifecycle properties mentioned above (e.g. no unacceptable reduction in performance as other components are added over time). It still leaves the team agile enough to seamlessly handle new or changing requirements or underlying technology (new versions of the app stack or a complete switchover to a different stack, to name a few) and to adapt to the many different experiments technologists require in their research and general scientific work. Such an architecture should also include metrics that give valuable insight into the platform’s performance across different components, and data for further analysis, should the business require it in future.
DAO Layer
The DAO layer in a software architecture is in charge of delivering support for data manipulation to higher layer data consumers, independently of the underlying data persistence layer. The idea is that, should we decide to completely replace the data manipulation layer, other layers will not even be aware of it. It consists of a set of classes handling communication with the persistence layer, be it some (NO)SQL database, file system, or external REST service. It also provides a consistent set of abstractions to the higher layer consumers of the DAO, streamlining their access options and unifying their access experience into a more standardized approach. In addition, it provides decision makers with information on the load on components and its nature: how to optimize, where to optimize, and why.
To meet the architectural goals above, let us mention some requirements for a DAO layer that could be defined while architecting a system. I want to emphasize that these are not necessarily connected or combined; they are mentioned here only as examples of requirements that could be derived from higher level architecture goals such as “enhance software reliability & performance”, “enable software evolution” or “reduce maintenance costs”:
- consistency in interfaces as DAO implementation classes will be developed by many developers; people with different knowledge, needs, experiences and affinities, serving completely different persistence needs.
- proper error handling and alerting when a DAO class breaks its contract (returning null/out of range, throwing irregular exception etc.) and what needs to happen in which case.
- knowledge about the specific DTO (Data Transfer Object) domain each DAO class is handling, so we can run proper validation checks against domain objects produced by it or sent to it, i.e. forced parameterization with DTOs that must extend an abstract DTO class or implement a DTO interface.
- alerting if some DAO components are underperforming, to highlight bottlenecks in the system. This requires monitoring system loads during runtime, with alert thresholds and profiling as required, so optimizations can be identified and evaluated.
- monitoring and alerting extreme memory consumption of a DAO component.
- a DAO class should only throw extensions of Framework defined exceptions, let’s assume subclasses of ‘DaoException’. All other exceptions that a DAO class (unintentionally) throws must be caught and logged, and it must be ensured that they are not overlooked by the people responsible for system health. An exception that is not supposed to be thrown is a good sign that something unpredictable happened, and it might point to a flaw in the component or even in the framework design.
- thread usage (and connections) should be managed by the Framework, with monitoring put in place to prevent thread or connection starvation. This is an example of addressing robustness or elasticity, an architectural decision that can have deep consequences for further development. In any case, this responsibility should not be handed to other parts of the system; developers and users of a DAO component should not need to know or care about it.
- all DAO classes must get their dependencies injected at instantiation, not later at runtime, and dependencies must not be implementation classes. The DAO layer must ensure that the bare minimum of dependencies a DAO class needs does not exceed its domain, i.e. all dependencies should be interfaces related to the persistence layer or other DAO interfaces only.
- caching and load balancing mechanisms must be implemented in the Framework and must not require the higher/lower layers to be aware of them. The Framework managing the DAO layer can make smart decisions about how to distribute load and caching, so a DAO component integrated into the DAO layer has no knowledge of the rules for how or when it will be called to deliver its results, or of how it is managed at all. The component is focused solely on delivering its logical functionality, and the borders of that ‘world’ should not be crossed. Let us assume a DAO class can be serialized and persisted to a file system during application shutdown, and later transferred to another node and run there without being notified of the complete change of environment.
- A DAO class must not be aware of the thread it runs in. Even the VM in which it runs can change without it being notified, so it must not have any impact on parts outside its strictly defined functionality.
- The DAO layer must be able to incorporate new data persistence layers without impacting the higher layer consumers or add new consumers without having to re-implement the persistence layer, e.g. change of database/OS etc. (strict separation of concerns).
This list of requirements goes on, showing that crafting a good application framework is not a simple task. It has long term consequences for developing and managing a system, and a (good or bad) impact on the team’s productivity and its ability to address issues in a timely manner, with less (or more) stress imposed on everyone involved in the process, while the short-term gains appear limited. Meeting these requirements will, however, mean the DAO layer will meet the longer term desirable properties outlined above and allow the system it serves to grow and evolve as required.
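A few of the requirements above (contract checks, exception policy, performance alerting) can be sketched as a thin guard that the Framework places around every DAO call. This is only a minimal illustration under stated assumptions: `Dto`, `Action`, `DaoContractViolation` and `DaoGuard` are hypothetical names, and only ‘DaoException’ comes from the requirements list itself.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Marker interface every transfer object implements (hypothetical name). */
interface Dto {}

/** Sample DTO used only for illustration. */
final class Action implements Dto {
    final String id;
    Action(String id) { this.id = id; }
}

/** Root of the exceptions a DAO is allowed to throw. */
class DaoException extends RuntimeException {
    DaoException(String message, Throwable cause) { super(message, cause); }
}

/** Raised by the framework when a DAO breaks its contract (hypothetical). */
class DaoContractViolation extends DaoException {
    DaoContractViolation(String message, Throwable cause) { super(message, cause); }
}

/** Framework-side guard: every DAO call is routed through invoke(). */
final class DaoGuard {
    private final long slowCallNanos;
    private final AtomicInteger slowCalls = new AtomicInteger();

    DaoGuard(long slowCallNanos) { this.slowCallNanos = slowCallNanos; }

    <T extends Dto> T invoke(String daoName, Supplier<T> call) {
        long start = System.nanoTime();
        try {
            T result = call.get();
            if (result == null) {
                throw new DaoContractViolation(daoName + " returned null", null);
            }
            return result;
        } catch (DaoException e) {
            throw e; // framework-defined exceptions pass through untouched
        } catch (RuntimeException e) {
            // anything else is unexpected: wrap it so it cannot be overlooked
            throw new DaoContractViolation(daoName + " threw an unexpected exception", e);
        } finally {
            if (System.nanoTime() - start > slowCallNanos) {
                slowCalls.incrementAndGet(); // a real system would raise an alert here
            }
        }
    }

    int slowCallCount() { return slowCalls.get(); }
}
```

The `<T extends Dto>` bound is one way to express the forced parameterization of DTOs; the counter stands in for whatever alerting mechanism a real system would use.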
The architecture will usually specify one generic DAO interface definition that is generic enough to meet the business needs of each specific DAO subsystem, yet sufficiently defined to cover the requirements for managing all DAO components in the Framework. Like any other part of a system, that process carries dangers of its own if we are careless. Let’s delve into a few of these issues.
Framework level
An example DAO can be seen in the picture above, where the generic DAO interface is defined at the Framework level, meaning the Framework will be able to work only with implementation classes of the DAO interface (and not the ActionDao interface). The Framework, then, is the software layer without which you can’t use your modules: if your class does not implement the DAO interface, it simply cannot be wired into the system. The Framework is not aware of the ActionDao interface or its specifics. This DAO interface contains only one read() method accessible by the Framework, which will return a specific DTO (Data Transfer Object) class. Intentionally defining only one method of the CRUD interface here leaves the remaining basic persistence functions as design decisions for the DAO Architect, reducing dependency between components through a more distinct separation of responsibilities.
By defining the read() method in the DAO interface, we put the Framework in a position to issue read() calls to components it is supposed to manage, not directly interact with. This is effectively a mistake: it empowers the Framework to misuse components that have access to data. That is dangerous and unnecessary, since data access is not the function of a Framework in charge of managing vital system components, and it opens a path to dangerous violations.
Being able to do that, a developer of the Framework’s management functionality may be tempted to take shortcuts in solving the problems of managing components by accessing their functionality directly: why not store cached data to the database before killing a component, what can go wrong with that, right? Or why not read some sensitive data when we are in a position to do so, bypassing the business rules and security checks imposed in other layers? Not everything has to go through MVC and be logged...
Another view is that the read() method does not belong at the Framework level at all, as it is not related to the Framework’s operations, unlike methods such as initialize() or shutDown(). Since read() is effectively needed in all DAOs, it is put here as an example of a potentially bad design decision that might come at a price in future: first, it exposes functionality in a place where it is not needed (or is dangerous to use), and second, it forces all DAO implementation classes to implement it, regardless of whether they need it. Potentially bad architectural decisions generate extra burden on the development side and leave the system vulnerable and bug prone, allowing someone to use functionality in places where it is not expected (as shown above, inside a framework that is in charge of managing DAOs, not manipulating them). Thorough framework development is as essential as making good software that actually uses the framework. Unlike implementation classes, framework classes are not meant to go through big and frequent changes; long term stability is what comes to mind with their development. Their process may therefore differ from classic software development, and a change in those classes often means moving forward with a new “big” version of the software release.
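The alternative view described above, where the Framework-level interface carries only lifecycle methods and read() lives one level up, could look roughly like the sketch below. The method signatures, the `Framework` class and `StubActionDao` are assumptions made for illustration; only the names initialize(), shutDown(), read() and ActionDao come from the discussion.

```java
import java.util.ArrayList;
import java.util.List;

/** Marker interface for transfer objects (hypothetical name). */
interface Dto {}

final class Action implements Dto {
    final String id;
    Action(String id) { this.id = id; }
}

/** Framework-level contract: lifecycle only, no data access. */
interface Dao {
    void initialize();
    void shutDown();
}

/** Architecture-level contract; the Framework never references this type. */
interface ActionDao extends Dao {
    Action read(String id);
}

/** Manages DAO lifecycles. It holds only Dao references, so it cannot call read(). */
final class Framework {
    private final List<Dao> managed = new ArrayList<>();

    void register(Dao dao) {
        dao.initialize();
        managed.add(dao);
    }

    void shutDownAll() {
        for (Dao dao : managed) dao.shutDown();
        managed.clear();
    }

    int managedCount() { return managed.size(); }
}

/** Trivial implementation used only to exercise the interfaces. */
final class StubActionDao implements ActionDao {
    boolean running;
    @Override public void initialize() { running = true; }
    @Override public void shutDown() { running = false; }
    @Override public Action read(String id) { return new Action(id); }
}
```

Because `Framework` sees its components only through `Dao`, the shortcut of reading data from inside the management code is ruled out by the compiler rather than by convention.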
Architecture level
Figure 5 now altogether…
Architecture is not just about the composition of different frameworks and their design/management. It is more a solution to questions of product stewardship, offering responses to future development requirements while keeping the promises of business and engineering properties our product has to uphold as it evolves and delivers its services. It is also about external technologies: how they are integrated into the product, how easily they can be replaced with one another, and how quickly and precisely we can find and address all potential differences in system behavior, not just drafting specs along our strategic goals and making sure everything ‘aligns’… Our example shows the ActionDao component defined as one that will handle Action objects (which must implement the DTO interface), with findObsolete() as the one method an implementation class needs to implement in order to be ActionDao ‘compatible’, i.e. that is the only way it can be wired into the system as a DAO module for handling actions. This interface extends the framework related DAO interface, which is not necessarily visible to the implementation class, but note that its read() method must be defined in the implementation class and needs to deliver a precise answer when called. By implementing both methods, the developer makes the class ‘pluggable’ into the system, at both the Architecture and the Framework level. Having well defined interfaces at the Architectural level allows us to easily develop a new set of implementation classes that will communicate with some other persistence layer, be it a completely new software stack or a mocked implementation of one. There is no way to omit some functionality; the class just won’t compile. That way we can respond transparently to technology changes without having a negative impact on the existing behaviours of the system.
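As a rough sketch of the interfaces described here, the following shows a generic DAO interface with read(), an ActionDao extending it with findObsolete(), and a mocked in-memory implementation. The article does not give the method signatures, so the `String` id, the `Instant` cutoff, the `createdAt` field and the `InMemoryActionDao` class are all assumptions.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Marker interface for transfer objects (hypothetical name). */
interface Dto {}

/** Root of the exceptions a DAO may throw. */
class DaoException extends RuntimeException {
    DaoException(String message) { super(message); }
}

/** Framework-level interface: one read() method, parameterized by DTO type. */
interface Dao<T extends Dto> {
    T read(String id);
}

final class Action implements Dto {
    final String id;
    final Instant createdAt;
    Action(String id, Instant createdAt) {
        this.id = id;
        this.createdAt = createdAt;
    }
}

/** Architecture-level interface for handling Action objects. */
interface ActionDao extends Dao<Action> {
    List<Action> findObsolete(Instant cutoff);
}

/** Mocked persistence: omitting either method would be a compile error. */
final class InMemoryActionDao implements ActionDao {
    private final Map<String, Action> store = new LinkedHashMap<>();

    InMemoryActionDao(List<Action> seed) {
        for (Action action : seed) store.put(action.id, action);
    }

    @Override public Action read(String id) {
        Action action = store.get(id);
        if (action == null) throw new DaoException("no action with id " + id);
        return action;
    }

    @Override public List<Action> findObsolete(Instant cutoff) {
        List<Action> obsolete = new ArrayList<>();
        for (Action action : store.values()) {
            if (action.createdAt.isBefore(cutoff)) obsolete.add(action);
        }
        return obsolete;
    }
}
```

A second implementation backed by a real database would implement the same `ActionDao` interface, so consumers holding an `ActionDao` reference would not need to change.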
Having several sets of implementation classes serving the same purpose means we can easily switch sets or adapt to new situations, even at runtime, without needing to stop the system at any point. Usually systems tolerate application restarts and have failover redundancies, so a requirement that a working application can never be shut down is rare, and is even considered dangerous. Architecturally, though, we might get into a situation where a shutdown is not acceptable and the application needs to be rewired ‘hot’ while running. The question is how to deliver on such a requirement without disturbing several subsystems and components that could span different teams working in different areas and serving different requirements.
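One possible way to rewire ‘hot’ is to hand consumers a stable proxy and swap the implementation behind it atomically. The sketch below assumes a simplified single-method ActionDao; `SwappableActionDao`, `rewire()` and the `source` field are hypothetical names, not something the article prescribes.

```java
import java.util.Objects;
import java.util.concurrent.atomic.AtomicReference;

/** Simplified DTO; the source field records which implementation produced it. */
final class Action {
    final String id;
    final String source;
    Action(String id, String source) {
        this.id = id;
        this.source = source;
    }
}

/** Simplified architecture-level interface (assumed signature). */
interface ActionDao {
    Action read(String id);
}

/** Stable reference handed to consumers; the delegate behind it can change at runtime. */
final class SwappableActionDao implements ActionDao {
    private final AtomicReference<ActionDao> delegate;

    SwappableActionDao(ActionDao initial) {
        this.delegate = new AtomicReference<>(Objects.requireNonNull(initial));
    }

    @Override public Action read(String id) {
        // each call uses whichever implementation is current at that moment
        return delegate.get().read(id);
    }

    /** Atomically installs a new implementation and returns the previous one. */
    ActionDao rewire(ActionDao replacement) {
        return delegate.getAndSet(Objects.requireNonNull(replacement));
    }
}
```

Consumers keep calling the same `SwappableActionDao` instance before and after `rewire()`, so they never learn that the persistence implementation changed. Calls already in flight finish on the old implementation, which a real system would have to drain before shutting it down.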
Changing the rules means adding new, unplanned deadlines for those teams. Such disturbances cause chain reactions you probably cannot control later, resulting in many new problems no one was aware of and pushing production dates far from what was in the business plan, which in turn delays organizational change and affects financial plans: a manager’s nightmare that nobody likes to see happening. So it is always best if anyone depending on a change does not have to care about it, or even be aware of it. If that is not achievable, we should all ask ourselves: how were we planning it?
Conclusion
Hopefully this example has shown that, to better manage the evolution of a complex system, we need some sort of management over our DAO components. It has also shown the importance of modularity and careful interface design in enabling future flexibility and easier refactoring, which applies not only to the DAO subsystem but to any subsystem in general. If you build a DAO class to cover some specific functionality, your class cannot be plugged into the system unless it implements a specific DAO interface, because it just can’t ‘fit’ into the Framework’s entry interfaces, and the moment it extends that interface, dozens of rules emerge that a developer must satisfy for the class to be pluggable into the system. More importantly, once the class is wired in, the system takes care of the problems that may arise from its usage. Given that a developer may be clumsy, overlook some details or not implement some methods at all, it is important that the framework detects irregularities in a class’s behaviour when it is plugged in and does something about it, such as simply turning off that instance, replacing it with an older one, safely stopping the system, or sending an appropriate warning. Similarly, a developer may not write sufficient unit tests, or may even avoid testing the work altogether and plug in a buggy component, so the system must be able to defend its robustness and performance, with mechanisms to detect unusual behavior early and recover from problematic components.
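The ‘replace it with an older one and send a warning’ reaction could be sketched as below. To stay short, a DAO is reduced to a `java.util.function.Function` from key to value; `FailoverDao`, `lastKnownGood` and the in-memory warning list are assumptions for illustration, not the article’s design.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Function;

/** Routes reads to a candidate DAO and falls back to an older one on misbehaviour. */
final class FailoverDao<K, V> {
    private final Function<K, V> lastKnownGood;
    private volatile Function<K, V> active;
    private final List<String> warnings = new CopyOnWriteArrayList<>();

    FailoverDao(Function<K, V> candidate, Function<K, V> lastKnownGood) {
        this.active = candidate;
        this.lastKnownGood = lastKnownGood;
    }

    V read(K key) {
        Function<K, V> current = active;
        if (current != lastKnownGood) {
            try {
                V value = current.apply(key);
                if (value != null) return value;
                demote("returned null");
            } catch (RuntimeException e) {
                demote("threw " + e.getClass().getSimpleName());
            }
        }
        // either already degraded, or the candidate just failed this call
        return lastKnownGood.apply(key);
    }

    /** Switches permanently to the older implementation and records one warning. */
    private synchronized void demote(String reason) {
        if (active != lastKnownGood) {
            active = lastKnownGood;
            warnings.add("candidate DAO disabled: " + reason);
        }
    }

    boolean isDegraded() { return active == lastKnownGood; }

    List<String> warnings() { return warnings; }
}
```

After the first irregularity the candidate is never called again, and the caller of `read()` still receives an answer from the older implementation. A production framework would send the warning to its alerting channel instead of keeping it in a list.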
In any case, the central point here is that separating the two ‘worlds’ and making each independent of what is going on in the other is hugely important for achieving a productive environment and maintainable products that will not grow overly complex as the system evolves. Adding new functionality or changing existing functionality can turn into a huge nightmare if we act in an unorganized manner, intermingling different sorts of responsibilities. Things inevitably get too tangled, and it becomes nearly impossible to introduce change without disturbing functionality on different levels, affecting business flow and consequently hurting income and/or company reputation, which is what we strive to avoid.
Future extensibility is provided by the fact that we can evolve how the Framework manages its own DAO instances without the need to ‘inform’ those DAO instances about significant changes in the behavior of their management component. This gives us the freedom to develop the components of a system incrementally and independently, without fear of reaching a point where the system cannot evolve further without huge interventions in many parts, resulting in new bugs and affecting the system’s stability. No less important are version dependencies: it is always a huge plus if a module does not have to be aware in any way that there is a new version of a module it depends on.
It’s always soothing to know that pulling a newer version of that module from the codebase changes no dependency at all: everything works as before, maybe only a little faster, with less memory consumption, and hey, there is that new functionality I can use to resolve some of my problems… Modules can evolve separately and independently, which can have huge benefits for continuous integration. On the other hand, it also allows DAO components to evolve independently of their ‘management system’, the Framework that controls their usage, and to be completely refactored from the inside without affecting outside behavior, and hence any functionality of the system. This allows us to rethink the internal workings of a component, improve it with a new algorithm, switch to another database or scale an application across several new instances without affecting other components that depend on the functionality the DAO class provides, reducing inter-team communication and potential showstoppers affecting our agenda.