Trap The Spark…

December 28, 2008

Persistence Layer

Gregari’s persistence layer consists of the following:

  • Database schema 
  • Value objects written as EJB3 Entity objects (mapped to the database via Java Persistence/Hibernate 3 annotations)
  • DAO (Data Access Objects) written as EJB3 SSB (Stateless Session Beans)

Value Objects - The value objects are generated using the Castor XML open source project and XML Schemas.  This was done because it’s relatively quick to model objects in a single file (an XML Schema) and generate the following:

  • classes
  • attributes
  • attribute getters/setters
  • XML marshal/unmarshal code
  • simple field validation
  • etc.    

Castor XML – Using Castor XML and XSD’s (XML Schema files), XML elements are mapped to Java classes, XML attributes are mapped to Java properties, and XML element relationships (with other XML elements) are mapped to collections of objects (1:1 or 1:n depending on the design).  So if you’re familiar with XML Schemas… Gregari is taking full advantage of the XSD specifications.  Castor XML is merely taking the XSD’s and generating Java classes.  The normal Castor XML configuration is leveraged, i.e.

  • mapping XML namespaces to Java packages
  • configuring which type of collections to use
  • creating a “hashcode” method (true)
  • creating an “equals” method (true)
  • inheriting from a common base class
  • etc.

All of this can be seen in the castorbuilder.properties file specification from Castor XML.

Components - From a modeling perspective… there are components (logical groupings of objects).  In Gregari a single XML Schema is mapped 1:1 with a component.  Within the XSD file are the component’s objects as well as the relationships between them (where applicable).  Furthermore, by using XDoclet2 tags within the XSD description element, the XDoclet annotations will be generated in the class, attribute and/or method JavaDoc comment sections.  XDoclet annotations that are used include those for Hibernate as well as some Gregari-custom XDoclet tags.  With the XDoclet tags in place… a post-processor (custom written, leveraging XDoclet’s parsing capabilities) transforms the Castor generated JavaBeans (POJO’s) into EJB3 Entity Beans complete with annotations (EJB, Java Persistence, Hibernate ORM, Hibernate Search, Hibernate Validator, JBoss Seam, as well as some Gregari specific Java5 Annotations).  In addition to the Gregari XDoclet post-processor (that transforms the POJO’s into EJB3 Entities), Hibernate Tools is used to generate Hibernate configuration/mapping files as well as database specific DDL SQL.

Build Process -  To generate value objects… the build process has the following steps:

  • From XSD files generate Java POJO’s using Castor XML Code Generator
  • From POJO’s transform into EJB3 Entities using a Gregari custom XDoclet plugin
  • From EJB3 Entities generate Hibernate configuration/mapping files (to generate DB specific DDL SQL) using Hibernate Tools
  • From Hibernate configuration/mapping files generate DB specific DDL SQL using Hibernate Tools

Gregari Schema component – All of this can be found in the Gregari Schema component.  Child components include the following (their individual mapping to databases is explained below while the database strategy is explained in the Database Strategy blog entry).

  • Common – Contains common XSD files used to generate common POJO’s.
  • Activity – Contains XSD’s specific to the user interface, i.e. metadata signifying what’s present as well as what can/cannot be customized by an application tenant.  These objects are associated with the Metadata database.
  • Audit – Contains XSD’s specific to any objects used to track system usage, i.e. user sessions, user requests, user authentication/authorization attempts, data accesses, debug statements, scheduled job audits, etc. These objects are associated with the Audit database.
  • Database – Contains any DDL SQL as well as DbUnit files used to create test data for any unit tests. There are no POJO objects within this component.
  • Etc – Contains XSD’s that specify XML document root elements, i.e. they are kept in one common place.  Not all root elements are here.
  • Instance – Contains XSD’s specific to any objects used to create multi-tenant data, i.e. notes, addresses, UDF’s, etc.  Presumably this could be placed in a single tenant shard or kept together in a multi-tenant shard. These objects are associated with the Instance database.
  • Metadata – Contains XSD’s specific to signifying what’s present in the system as well as what can/cannot be customized by an application tenant.  Also keeps track of any customizations. These objects are associated with the Metadata database as explained in the Database Strategy blog entry.
  • Reporting – Contains XSD’s specific to the reporting component, i.e. metadata signifying what report categories, reports and report parameters are present. These objects are associated with the Metadata database.
  • Scheduler – Contains XSD’s specific to the scheduler component, i.e. metadata signifying what job categories, jobs, scheduled jobs are present. These objects are associated with the Metadata database.

EJB3 Entity Final Notes - As noted earlier the EJB3 Entities are just basically POJO’s with getter/setter methods, XML marshal/unmarshal methods, and simple field validation.  There is no complex validation nor business logic within these objects as they are generated.  However, they are heavily annotated with Java5 Annotations to trigger behavior via Interceptors, etc.  Furthermore the value objects have no understanding of an underlying persistence mechanism, i.e. they could be mapped against a database or marshaled into XML and sent to an ESB.  Any persistence via a database or an ESB would need to be handled in other objects, i.e. DAO’s, etc.

DAO (Data Access Objects) - DAOs wrap the interaction between the value object and the persistence mechanism (in this case Hibernate).  Normally you create a well defined interface and then whatever concrete implementations you need, so you could swap out implementations as needed (i.e. talk to the database, or LDAP, or an ESB).  That’s the theory at least.  So in these DAO’s a javax.persistence.EntityManager is injected. This EntityManager is tied to a persistence context (declared via the javax.persistence.PersistenceContext annotation). The persistence context in turn is tied to an underlying database. For Gregari only one EntityManager/PersistenceContext is injected per DAO (this helps as there are multiple databases and I want to make sure the right object goes into the right database). The layer above (i.e. the BO or Business Object layer) handles the orchestration of multiple DAO’s in a given unit of work.  The BO object is not tied to a specific database; rather, it can work with multiple DAO/EntityManager/PersistenceContext tuples.

The DAO’s typically have the following methods:

  • get a list of objects (given any parent ID’s as well as a QBE (query by example) search pattern)
  • get a specific object based on its ID
  • get a specific object based on its business key (code, tenant ID, etc)
  • insert an object
  • update a specific object (for UI’s that drill down into the object details and potentially update its contents)
  • update a set of objects (for UI’s, etc. that support bulk updates)
  • delete a set of objects (for UI’s, etc. that allow multiple deletes)

Generally speaking this is set up for your normal QBE Search (Query By Example) as well as CRUD operations (Create, Read, Update, Delete).  Very specific searches (get this object’s children that are active) can be placed in the BO (with the QBE parameters hardcoded), thus reusing the “get a list of objects” DAO method.  Where reuse doesn’t make sense, create new BO/DAO methods to handle the query.  But ideally the BO objects have far more methods to handle specific business logic than the DAO’s.  If the DAO’s have a lot of methods then… you may need to reconsider your model.  In addition the DAO’s should be reusable across BO’s.  Sure, there might be a 1:1 mapping between a BO and a DAO for CRUD type business logic, but for more complex business logic… the BO might need multiple DAO’s.
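As a rough illustration of the DAO shape described above, here is a minimal sketch in plain Java.  Every name in it (Note, NoteDao, InMemoryNoteDao, NoteBo) is hypothetical and not taken from Gregari.  A java.util.function.Predicate stands in for the QBE search pattern, and an in-memory map stands in for the injected EntityManager, so this shows the division of labor between BO and DAO rather than real persistence code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Hypothetical value object; the real entities are generated from XSD's. */
final class Note {
    final long id;
    final long tenantId;
    final String code;      // business key, unique per tenant
    final boolean active;

    Note(long id, long tenantId, String code, boolean active) {
        this.id = id;
        this.tenantId = tenantId;
        this.code = code;
        this.active = active;
    }
}

/** Hypothetical DAO contract covering the QBE search plus CRUD methods listed above. */
interface NoteDao {
    List<Note> findByExample(long tenantId, Predicate<Note> example);
    Note findById(long id);
    Note findByBusinessKey(long tenantId, String code);
    void insert(Note note);
    void update(Note note);
    void updateAll(List<Note> notes);
    void deleteAll(List<Long> ids);
}

/** In-memory stand-in; a real implementation would delegate to an injected EntityManager. */
final class InMemoryNoteDao implements NoteDao {
    private final Map<Long, Note> rows = new LinkedHashMap<>();

    public List<Note> findByExample(long tenantId, Predicate<Note> example) {
        List<Note> result = new ArrayList<>();
        for (Note n : rows.values()) {
            // The tenant check is applied here on every search, not left to the caller.
            if (n.tenantId == tenantId && example.test(n)) {
                result.add(n);
            }
        }
        return result;
    }

    public Note findById(long id) {
        return rows.get(id);
    }

    public Note findByBusinessKey(long tenantId, String code) {
        for (Note n : rows.values()) {
            if (n.tenantId == tenantId && n.code.equals(code)) {
                return n;
            }
        }
        return null;
    }

    public void insert(Note note) {
        if (rows.containsKey(note.id)) {
            throw new IllegalStateException("duplicate id " + note.id);
        }
        rows.put(note.id, note);
    }

    public void update(Note note) {
        if (!rows.containsKey(note.id)) {
            throw new IllegalStateException("unknown id " + note.id);
        }
        rows.put(note.id, note);
    }

    public void updateAll(List<Note> notes) {
        for (Note n : notes) {
            update(n);
        }
    }

    public void deleteAll(List<Long> ids) {
        for (Long id : ids) {
            rows.remove(id);
        }
    }
}

/** Hypothetical BO: a specific search reuses the generic DAO method with hardcoded criteria. */
final class NoteBo {
    private final NoteDao dao;

    NoteBo(NoteDao dao) {
        this.dao = dao;
    }

    List<Note> findActiveNotes(long tenantId) {
        return dao.findByExample(tenantId, n -> n.active);
    }
}
```

The point of the sketch is that NoteBo.findActiveNotes adds no new DAO method; the "active only" rule lives in the BO while the DAO stays small and reusable.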

Annotations – The DAO objects don’t have a lot of Java 5 annotations, currently just EJB3 Stateless Session annotations, JBoss Seam annotations and some Azzura Gregari specific annotations for checking that the right tenant’s users are seeing the right data.  The annotations take into account the following concepts, i.e.

  • Provider data – Data that only the SAAS provider should see.
  • Global data – Data that any tenant can see, i.e. metadata regarding the system (entities, attributes/methods/relationships on the entities, etc)
  • GlobalLocal data – Data that normally is global but can be overridden by a particular tenant.  Overridden data should only be visible to the tenant whereas global data should be available to all tenants.  Essentially there could be a mixture of global and local data within the same result set (typically metadata).  An example might be for a status attribute enumeration, i.e. global values would be “active” or “inactive”.  But a tenant might have added “deleted” to satisfy one of their internal needs.
  • Local data – Data that should only be visible to the tenant.
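To make the GlobalLocal idea concrete, here is a minimal sketch of how a mixed result set could be resolved for one tenant.  The class names (StatusValue, GlobalLocalResolver) and the convention that a null tenantId marks a global row are assumptions made for the example; they are not Gregari’s actual annotations or schema.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical list-of-values row; a null tenantId marks a global row (an assumption). */
final class StatusValue {
    final String code;
    final String label;
    final Long tenantId;

    StatusValue(String code, String label, Long tenantId) {
        this.code = code;
        this.label = label;
        this.tenantId = tenantId;
    }
}

final class GlobalLocalResolver {
    /**
     * Returns the rows one tenant should see: every global row, plus that
     * tenant's own rows. A tenant row with the same code replaces the global one.
     * Rows owned by other tenants are never returned.
     */
    static List<StatusValue> visibleTo(long tenantId, List<StatusValue> rows) {
        Map<String, StatusValue> byCode = new LinkedHashMap<>();
        // First pass: global defaults.
        for (StatusValue row : rows) {
            if (row.tenantId == null) {
                byCode.put(row.code, row);
            }
        }
        // Second pass: this tenant's additions and overrides.
        for (StatusValue row : rows) {
            if (row.tenantId != null && row.tenantId.longValue() == tenantId) {
                byCode.put(row.code, row);
            }
        }
        return new ArrayList<>(byCode.values());
    }
}
```

With global values “active” and “inactive” and a tenant that has added “deleted”, that tenant gets all three values back while every other tenant still sees only the two global ones.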

In subsequent blogs I’ll describe the service layer, add some diagrams and code snippets.


Apache Maven vs. Apache Ant (Part 2)

Earlier I blogged about my recent experiences with Apache Maven, i.e. Apache Maven vs. Apache Ant.  The major issues I had were:

  • Performance – Maven’s performance (compared to Apache Ant) is horrible.  It doesn’t skip anything, and worse, it repeats some steps throughout the build.  It’s constantly calling out to the Internet looking for SNAPSHOT’s (per target/goal) as if they had somehow magically changed mid-build.  A hack for the repeated SNAPSHOT searches is to review all of your dependencies and remove as many SNAPSHOT’s of 3rd party components as possible.  This isn’t foolproof as some of your dependencies might themselves depend on SNAPSHOT’s.  (For the ultimate hack you could tweak all of the POM’s in question but that’s painful and error prone, and the next time you update something it could all get blown away.)  Another hack is to shut off your Internet connection.  The lack of a connection makes Maven stop looking (brilliant).
  • Scalability – Maven is a memory hog. The more complex your build process becomes (i.e. more nested components leveraging the reactor), the more memory is consumed.  Here you need to tweak the MAVEN_OPTS environment variable.

I was recently able to make some performance improvements to help Apache Maven, i.e.

  • Got a new laptop, an Apple MacBook Pro (2.6 GHz, 4 GB RAM, 7200 RPM hard drive)… not sure what exactly has sped things up but I suspect it’s the hard drive (as everything else is only marginally better, i.e. 0.30 GHz faster CPU, 1 GB more RAM).
  • Tweaked the MAVEN_OPTS 

Originally I had the following MAVEN_OPTS setting: setenv MAVEN_OPTS "-Xmx640m" but I was getting “out of perm space” errors, so I switched to the following MAVEN_OPTS settings:

setenv MAVEN_OPTS "-Xms128m -Xmx640m -Dsun.rmi.dgc.client.gcInterval=3600000 -Dsun.rmi.dgc.server.gcInterval=3600000 -XX:+UseConcMarkSweepGC -XX:+CMSPermGenSweepingEnabled -XX:+CMSClassUnloadingEnabled -XX:MaxPermSize=640m -Xverify:none"

But as more components were built, it locked up the system.  I suspect the issue is the garbage collection settings.  So I reduced the MAVEN_OPTS settings to the following:

setenv MAVEN_OPTS "-Xms128m -Xmx640m -XX:MaxPermSize=640m"

This immediately helped get further in the build process.  Eventually it ran out of heap memory so now the MAVEN_OPTS settings are as follows:

setenv MAVEN_OPTS "-Xms128m -Xmx1280m -XX:MaxPermSize=640m"

This seems to have done the trick.  I’m now able to do complete builds (mvn clean install) with these settings without doing hacks like shutting off my network connections, tweaking POM’s, etc.

But… I still need to get this working for creating sites (i.e. mvn site) as well as getting it running with integration tests… (I have quite a few).

In the meantime I’ve been reading the Ant In Action book (which is quite good).  I’ve basically re-examined my entire Ant build process (and theory).  So I’ve started creating scripts from scratch trying to incorporate concepts from the book, i.e.

  • Declaring “beaucoup” ANT properties that can be overridden in a property file (or passed in properties from parent ANT scripts).
  • Importing ANT tasks (both life cycle tasks that can be overridden as needed as well as common functionality).
  • Potentially leveraging IVY for dependency management.
  • Potentially leveraging SmartFrog for deployment management (conceptually cool project).
  • Getting JBoss Seam’s testing to work with ANT (shouldn’t be an issue as JBoss Seam uses ANT primarily).

Fingers crossed but making progress…

December 15, 2008

Gregari’s Database Strategy (multiple databases, multi-tenancy, shards, etc.)

Gregari is meant to show an enterprise application that has SaaS (Software As A Service) capabilities.  In creating Gregari’s database strategy I actually created several databases, i.e.

  • Metadata – System configuration (metadata specific to provider functionality or global metadata which would be common to all tenants) as well as tenant specific metadata resides within this database.  Generally speaking nothing in here is confidential, and if it is, those database columns can be encrypted (user passwords, if stored in a single database rather than LDAP, would be a good example).
  • Audit – Any auditing information also goes to a single database.  Only the system is creating these objects/rows, i.e. user sessions, security authentication/authorization attempts, row-level data access attempts, debug statements from logs, Web site requests, Web Service requests, scheduler job audits, etc.  While there’s information about the system and the tenants/users who use it… this is more about audit information, i.e. system resource, tenant, user, timestamps, success/failure, etc.
  • Instance – Any tenant specific information that is specific to the SaaS application is stored here.  As an example… if the SaaS application in question is a CRM application… then stored in the Instance database would be any leads, opportunities, quotes, orders, etc.  The Metadata database might record that the tenant has subscribed to the CRM application’s Marketing, Sales and Support modules as well as which employees have subscriptions, but that should be the extent of what’s stored in the Metadata database.
  • 3rd Party Products – Components like Quartz and JBoss jBPM can both leverage a database.  These database objects are stored separately since they’re separate projects/products.
  • Reporting – Any reports or data feeds pull their data from this database.  A single ETL job pulls data from the metadata/instance/auditing databases as needed.  This way the metadata/instance/audit/etc databases can be optimized for OLTP operations and the reporting database can be optimized for … reports/bulk queries.

Since Gregari is using EJB3 Entities via JBoss Hibernate’s Entity Manager… this produces an interesting issue, i.e. EJB3 Entities are associated with a “persistence context”.  The persistence context is associated with one and only one database (if I understand the specification correctly).  From a packaging perspective there can only be one persistence context per JAR file that contains an ejb-jar.xml and persistence.xml file.  As such any EJB3 Entities need to be packaged in the correct JAR file… the JAR file that’s associated with the corresponding database/persistence context.

The other major concept to consider is foreign key relationships… Normally in a database designed for SaaS… you have the tenant table associated with each applicable object/table via a foreign key.  But if the tenant object/table is in the Metadata database and the CRM opportunity object/table is in the Instance database… there’s an issue of foreign keys.  Sure, there might be some database vendor specific way to handle it.  But Gregari attempts to do this in a database vendor neutral manner by creating some “hidden” tables to mimic the tenant table in each of the other databases.  As an example:

  • Metadata database has the Tenant table.  All tables within the Metadata database reference (where necessary) this Tenant table via foreign key relationships.
  • Audit database has an AuditTenant table.  All tables within the Audit database reference (where necessary) this AuditTenant table via foreign key relationships.
  • Instance database has an InstanceTenant table.  All tables within the Instance database reference (where necessary) this InstanceTenant table via foreign key relationships.

Whenever a Tenant object/row is created a corresponding AuditTenant as well as InstanceTenant object/row are created and inserted (via the Service layer generating/observing events) and the reverse is done if the Tenant is deleted from the Metadata database.  The AuditTenant and InstanceTenant objects/tables are pretty dumb, i.e. primary key column, tenantId, created by/on, updated by/on… and that’s it.  This concept is extended to any other duplicate tables between databases, i.e. User/AuditUser/InstanceUser.  Apart from inserts/deletes these tables are never touched, i.e. there’s no other data to keep in sync.  As mentioned earlier this is all handled in the Service layer via Seam Events/Observers allowing decoupling between business objects.  The event generation/observers design pattern was picked to allow an agnostic approach to the database (i.e. don’t use any database specific capabilities to join two databases).  This allows portability between database vendors.  It’s also the basis of event driven systems allowing decoupled development.
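The Tenant/AuditTenant/InstanceTenant synchronization can be pictured with a small observer sketch.  This is plain Java rather than Seam Events/Observers, and the names (TenantObserver, ShadowTenantTable, TenantService) are hypothetical.  Each “table” is just an in-memory set, so transactions and failure handling across the real databases are deliberately left out.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical observer contract for tenant lifecycle events. */
interface TenantObserver {
    void tenantCreated(long tenantId);
    void tenantDeleted(long tenantId);
}

/** Stand-in for one of the "hidden" tables (AuditTenant or InstanceTenant). */
final class ShadowTenantTable implements TenantObserver {
    private final Set<Long> tenantIds = new HashSet<>();

    public void tenantCreated(long tenantId) {
        tenantIds.add(tenantId);
    }

    public void tenantDeleted(long tenantId) {
        tenantIds.remove(tenantId);
    }

    boolean contains(long tenantId) {
        return tenantIds.contains(tenantId);
    }
}

/** Stand-in for the service layer that owns the Tenant table and raises the events. */
final class TenantService {
    private final Set<Long> tenants = new HashSet<>();
    private final List<TenantObserver> observers = new ArrayList<>();

    void register(TenantObserver observer) {
        observers.add(observer);
    }

    void createTenant(long tenantId) {
        // Only notify when the row was actually inserted.
        if (tenants.add(tenantId)) {
            for (TenantObserver o : observers) {
                o.tenantCreated(tenantId);
            }
        }
    }

    void deleteTenant(long tenantId) {
        // Only notify when the row was actually removed.
        if (tenants.remove(tenantId)) {
            for (TenantObserver o : observers) {
                o.tenantDeleted(tenantId);
            }
        }
    }
}
```

TenantService knows nothing about which shadow tables exist; adding a third database only means registering another observer, which is the decoupling the event approach is meant to buy.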

Other concepts include multi-tenancy, i.e. basically there’s a Tenant object/table and all objects/tables that have tenant-specific data… reference that Tenant object/table in a foreign-key relationship.  Pretty simple.  So as long as your queries reference the tenant ID… everything is grand.  If you don’t… you’ll have issues.  So development and testing are critical to ensure this.

Still other concepts include the notion of shards…  For some tenants having all of their data co-existing with other tenants is OK as long as there’s some way to keep it all segregated.  Other tenants will insist that their data is kept completely separate, i.e. in another database.  For JEE this can be difficult as… there’s a single database connection configuration, i.e. driver, database name, system account username/password.  It’s very difficult to have separate databases per tenant without a configuration/coding nightmare.  Ideally there’s some magical way to support both single-tenant databases and multi-tenant databases using one set of code…  Hibernate Shards is meant to do this but currently doesn’t have support for JPA (Entity Manager, etc).  Another option is Oracle VPD (Virtual Private Database), which I just learned about.  So this concept of tenant specific databases is a bit more problematic but… possible with some thought/some code.  Key things to consider:

  • Design for provisioning – Ideally any provisioning is relatively instantaneous.  Waiting up to 24 hours for a dead period in the operating cycle (or waiting for a weekend) isn’t ideal.
  • Design for operations – Ideally the system doesn’t have to come down in order to add the single tenant database to the overall system.  Bringing a system down for anything but a major upgrade isn’t ideal.  It’s an opportunity for a FUBAR situation, especially if there’s a large cluster of servers without automated configuration management.
  • Database connection pooling – Ideally there’s no impact on any connection pooling.

In subsequent blogs I’ll detail how object history, user defined fields, etc. are/were designed.

December 14, 2008

Java5 Annotations

In building Gregari I’ve tried to leverage two basic concepts:

  • Generate as much as I can via XDoclet (or Castor) at build time, ideally any boilerplate code.
  • Write the rest but leverage abstract classes or Java5 Annotations/Interceptors.

Assuming one is familiar with Java Annotations… prior to the Java 5 release if you wanted to use the annotation concept you could do the following, i.e.

  • at build time use XDoclet tags @blah.blah-blah my-attribute="hi" within the JavaDoc section
  • at runtime use Spring AOP Annotations (amongst others) by @@MyAnnotation( my-attribute="hi")

But then came Java 5 with built-in annotation support and so I converted all of my Spring annotations over to the Java 5 standard.  But I still had a lot of ugly XDoclet tags remaining in my code.  After reading Manning’s “Seam In Action”, i.e. http://www.manning.com/dallen/… I was struck by how JBoss Seam leveraged Java5 Annotations.  So I basically got rid of my mixture of XDoclet tags (for code generation) and Java5 Annotations (for runtime)… and instead went to exclusively Java5 Annotations.  It even sped up XDoclet2 code generation by approximately a factor of 10.  The downside is perhaps I went overboard… but not sure what else to do…

So I use Java5 Annotations for the following major concepts:

  • Taxonomies – I have two major taxonomies.  The first distinguishes one type of object from another (i.e. business objects from data access objects, mainly done for code generation purposes).  The second describes what capabilities the object supports so that any cross cutting Interceptors can act accordingly.  Essentially these taxonomy annotations play the role of “marker interfaces”.  They are applied at both the class and method levels… where appropriate.
  • Metadata – In order to customize/configure/personalize enterprise applications (such as in a SaaS model where you can configure the application)… you first need to know what’s possible.  Embedding annotations in the code should centralize code/metadata in one place.  As a part of a build process (or other) the code can be parsed and the metadata stored in configuration files, databases, etc.
  • JEE specific annotations (EJB’s, Transactions, Entities)
  • Hibernate (ORM, Validation and Search)
  • JBOSS Seam specific annotations
  • System management – One of the issues for 24/7 Operations (DFO-Design For Operations) is that developers write the code but operations supports it (yet there’s little hand off between the two groups).  What if development annotated their code to create health models, instrumentation models, task models, etc. so that a management system could be generated to support the code being developed.  Perhaps just another way to support Extreme Programming, i.e. your code is self documenting.  Normally this is just for other developers… but operation/development engineering (i.e. developers who create software to optimize system management) is another audience.
  • Interface generation – A Java best practice is to have all concrete objects implement an interface.  Other objects reference the interface instead of the concrete object.  This allows many cool things, i.e. mock objects, etc.  Normally the interface is hand written but… if the concrete class were the master… you could generate its supporting interface (assuming this is the first implementation of that interface).  Interface generation would be optional based on whether the marker annotations were present or not.  So… this is done at the class level to signify an interface as well as at the method level to signify which methods should be included in the interface.
  • Method level behavior, i.e. should encryption/decryption be implemented, caching, user-defined-fields (UDF’s), history, notes, addresses, list of values (enumerated values), security, etc.  JEE Interceptors will detect this and act accordingly.
  • Package level behavior, i.e. by leveraging the Java5 concept of package-info.java files at the package level, I can implement component level annotations.  All sub-packages presumably are of the same “component”, and thus can leverage component level behavior (where applicable).  I use this at the system management level to create statistics by component.
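Here is a small sketch of what taxonomy and interface-generation markers could look like when read back through reflection.  The annotation and class names (BusinessObject, InterfaceMethod, OrderBo, AnnotationScanner) are invented for the example and are not Gregari’s real annotations; the reflection calls themselves are standard Java.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical taxonomy marker: flags a class as a business object. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface BusinessObject { }

/** Hypothetical method-level marker: the method belongs in the generated interface. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface InterfaceMethod { }

@BusinessObject
class OrderBo {
    @InterfaceMethod
    public void submit() { }

    // Not marked, so a generator would leave it out of the interface.
    public void internalHelper() { }
}

/** What a code generator or interceptor might do with the markers. */
final class AnnotationScanner {
    static boolean isBusinessObject(Class<?> type) {
        return type.isAnnotationPresent(BusinessObject.class);
    }

    static List<String> interfaceMethods(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Method m : type.getDeclaredMethods()) {
            if (m.isAnnotationPresent(InterfaceMethod.class)) {
                names.add(m.getName());
            }
        }
        return names;
    }
}
```

RetentionPolicy.RUNTIME is what lets an interceptor see the marker at runtime; a build-time-only generator could get by with a weaker retention policy.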

In subsequent blog postings (to this one) I’ll give specific examples of what this looks like.

Apache Maven vs. Apache Ant

For doing software builds in the Java world… most people use either Ant or Maven.  I’ve used both…

From a requirements perspective I need a continuous integration/build process that can do the following:

  • Dependency Management – In one place identify the components and their versions that are included in this project.  Ideally the same dependency management system can deal with child dependencies automatically.
  • Support some form of reuse across the components in terms of the build scripts.  Ideally there are no large scripts with redundant text across the commands, as those are difficult to maintain/keep consistent.
  • Support for hierarchical projects composed of numerous components (must scale and allow complexity).
  • Allow quick builds (must perform)
  • Ideally create a project Web site (documentation)
  • At least create project reports per component.

I started in the Ant world, i.e. oodles of ANT scripts on big projects.  My main issues were the complexity of the ANT scripts after a while and the lack of reuse when you had a multi-component project.  But it was fast and it could do anything, i.e. powerful.  I experimented on/off with Maven over the years.  Finally got it working this year.  Everything was as advertised except that it was really slow when there was an Internet connection (and just slow in general).  And I needed to break large projects up and build them separately (or they’d crash early, i.e. PermGen space errors).  With command lines like the following:

MAVEN_OPTS "-Xms128m -Xmx640m -Dsun.rmi.dgc.client.gcInterval=3600000 -Dsun.rmi.dgc.server.gcInterval=3600000 -XX:+UseConcMarkSweepGC -XX:+CMSPermGenSweepingEnabled -XX:+CMSClassUnloadingEnabled -XX:MaxPermSize=640m -Xverify:none"

It didn’t crash on the mini-builds… but when trying to build everything it would just slow down and ultimately lock the machine.  I was really attracted to the Development Web site that was created with mvn site, but if the system can neither perform nor scale… it’s not worth it.  Looking at Seam and Hibernate… they appear to get by just fine.  And then there’s trying to get the JBoss Embedded Server running in Maven…  I’ve searched a lot/experimented and I’m just not sure what I’m missing there.  Time to throw in the towel.  So now I’m going to move back to Ant.  In the past few years… it appears as though Ivy has caught on, so perhaps that covers Dependency Management.  There also appear to be some templating/reuse concepts (although those may have been there a while and I overlooked them).  Not sure how much can be templated so I’ll be interested to investigate.

So in retrospect… Maven is nice for small projects.  Mine appears massive (i.e. many components, a lot of code generation, etc.).  So time to go back to Apache Ant.  Also need to get a wiki like http://www.SeamFramework.org and http://in.relation.to (i.e. the documentation posted to Al Gore’s Internet).

December 12, 2008

Layers

Filed under: Concepts — Mark De Lanoy @ 6:16 am

In creating Gregari as an Enterprise class Web Application… we needed a few architectural layers.  But before I knew it there were quite a few layers…  Actually layers within layers.  When creating architectures there are a number of views that should be considered, e.g. the logical architecture (web servers, application servers and database servers) and the physical (or deployment) architecture (the actual physical hardware/network devices/layers).  You might have a three tier logical architecture that all resides on one box or is split out across three physical boxes.  Some examples might include the following:

  • When doing prototypes I might have a three tier architecture but everything is physically installed on my laptop (three logical layers, one physical layer)
  • When doing testing on a budget you might have the web and application servers on the same physical server and the database on a dedicated server.  So three logical layers on two physical boxes.
  • Or if you want to do it right… the web tier on its own physical cluster of boxes, the application tier on another set of boxes, and the database on yet another set of boxes.

Sophisticated Web sites have numerous logical layers on physically separate hardware (composed in layers).  So when creating enterprise software… you need to be cognizant of that.  The deployment scenarios can be quite varied.  So you need to design for it.  So for Gregari I used a layout seen in fairly large sites.  If you can do it … then your architecture is pretty solid.  So I designed for the worst case.

Deployment Architecture Layers - After having done several Internet applications that required fairly secure infrastructures, the following deployment architecture layers were created, i.e.

  • Web Layer – This layer hosts any Internet facing Web Applications, i.e. external facing applications such as Web sites, Mobile Web sites, REST sites, etc.  Typically this just contains presentation logic with business logic being in lower layers. 
  • App Layer – This layer hosts any Intranet facing Web Applications, i.e. internal facing applications such as Web Services, Administration/Reporting sites, etc.  Typically this contains presentation logic for any Intranet applications as well as all the business logic. 
  • Data Layer – This layer hosts any database or other repository, i.e. database, LDAP, ESB (Enterprise Service Bus), document management, etc.

Network connectivity would look like the following, i.e.

  • To the Web Layer (from the Internet browser) – Typically would only occur via HTTP/HTTPS, i.e. a Web browser on a desktop or a phone.  
  • To the App Layer (from the Web Layer) – Typically would either be an HTTP/HTTPS Web Service call or a remote call via Java RMI, etc.
  • To the App Layer (from a VPN connection) – Typically would only occur via HTTP/HTTPS, i.e. a Web browser on a desktop or a phone (if connecting to an SMS aggregator).
  • To the Data Layer (from the App Layer) – Typically would occur via some repository specific protocol, i.e. communicating with a database could be via JDBC, etc.

Note that the Web Layer needs to go through the App Layer to access the Data Layer, i.e. the Web Layer can’t bypass the App Layer.  This promotes security via layers (presumably lots of firewalls, network intrusion detection, etc).

Application Architecture Layers – In designing an architecture for an Enterprise class application… considering the deployment topology is critical, i.e. within the Application Architecture… where do all the application layers go?

Currently the Gregari project has the following layers… (starting from the tier closest to the database)

  • Value objects – EJB3 Entity objects.  Retrieved by the DAO’s using the JPA (Java Persistence API) framework.
  • DAO (Data Access Objects) objects – Stateless EJB3 objects implementing javax.ejb.Local annotated interfaces.  These objects interact with the Java Persistence API classes and enforce SAAS (Software As A Service) multi-tenancy concepts.
  • ESB (Enterprise Service Bus) objects – Stateless EJB3 objects implementing javax.ejb.Local annotated interfaces.  These objects interact with ESB’s for handling any integration to the tenant, i.e. triggering business logic and/or persisting data if the tenant doesn’t want to persist it in a multi-tenant database or a shard.
  • BO (Business Object) objects – Stateless EJB3 objects implementing javax.ejb.Local annotated  interfaces.  These objects contain the application business logic.
  • Observer objects – Stateless EJB3 objects implementing javax.ejb.Local annotated interfaces.  These objects observe events triggered by BO objects and/or JBoss Seam.  This allows a decoupled architecture where BO objects can emit events and other objects at runtime can react to those events without making spaghetti code.
  • Facade objects - Stateless EJB3 objects implementing javax.ejb.Remote annotated interfaces and javax.jws.WebService annotated interfaces.  These objects would be called by clients not necessarily in the same JVM (Java Virtual Machine), so they need to be remotable, i.e. via Java RMI (via the Remote annotation) or via a Web Service (via the WebService annotation).  These Facade objects need to bootstrap the JBoss Seam container, hence they invoke any BO objects via the JBoss Seam “Component.getInstance( … )” call.
  • Proxy objects - Stateless EJB3 objects implementing javax.ejb.Local annotated interfaces.  These objects are basically a proxy layer to the facade layer.  The theory is that the proxy layer is the client to the facades.  The proxies might connect to the facade layer using local interfaces (if the facades are in the same JVM (Java Virtual Machine) as the client code), remote interfaces (if on physically separate boxes), or Web Services (if that’s preferred).
  • Action objects - Stateless EJB3 objects implementing javax.ejb.Local annotated interfaces.  These action objects are used by JBOSS SEAM (using JSF (Java Server Faces)).  These action objects call the proxy objects … which in turn call the facade layer (which may or may not be on the same physical box).
  • REST objects – Stateless EJB3 objects implementing javax.ejb.Local annotated interfaces.  These REST objects expose the facade layers (by calling the proxy layer) via REST to developers across the Internet.  
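
As a rough illustration of how the lower layers might fit together, here is a minimal plain-Java sketch of a DAO that scopes every query to a tenant and a BO that emits an event for observers to react to.  It deliberately leaves out the EJB3 and Seam annotations (comments mark where they would presumably go) so that it runs without a container.  Every name in it (Contact, ContactDao, ContactBo, EventBus, the “contactCreated” event) is hypothetical and not taken from the Gregari code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Plain-Java sketch of the DAO -> BO -> Observer layering (hypothetical names, no container).
final class LayerSketch {

    // Value object: would be an EJB3 Entity in the real layering.
    static final class Contact {
        final String tenantId;
        final String name;

        Contact(String tenantId, String name) {
            this.tenantId = tenantId;
            this.name = name;
        }
    }

    // DAO layer: would be a @Local interface implemented by a @Stateless bean.
    interface ContactDao {
        void save(Contact contact);
        List<Contact> findAll(String tenantId);
    }

    // In-memory stand-in for a JPA-backed DAO.
    static final class InMemoryContactDao implements ContactDao {
        private final Map<String, List<Contact>> rowsByTenant = new HashMap<>();

        @Override
        public void save(Contact contact) {
            rowsByTenant.computeIfAbsent(contact.tenantId, k -> new ArrayList<>()).add(contact);
        }

        @Override
        public List<Contact> findAll(String tenantId) {
            // Multi-tenancy: a tenant only ever sees its own rows.
            List<Contact> rows = rowsByTenant.getOrDefault(tenantId, Collections.emptyList());
            return new ArrayList<>(rows);
        }
    }

    // Tiny event bus standing in for JBoss Seam events/observers.
    static final class EventBus {
        private final Map<String, List<Consumer<Object>>> observers = new HashMap<>();

        void observe(String event, Consumer<Object> observer) {
            observers.computeIfAbsent(event, k -> new ArrayList<>()).add(observer);
        }

        void raise(String event, Object payload) {
            List<Consumer<Object>> targets = observers.getOrDefault(event, Collections.emptyList());
            for (Consumer<Object> observer : targets) {
                observer.accept(payload);
            }
        }
    }

    // BO layer: business logic; raises an event instead of calling observers directly.
    static final class ContactBo {
        private final ContactDao dao;
        private final EventBus events;

        ContactBo(ContactDao dao, EventBus events) {
            this.dao = dao;
            this.events = events;
        }

        Contact create(String tenantId, String name) {
            if (name == null || name.trim().isEmpty()) {
                throw new IllegalArgumentException("name is required");
            }
            Contact contact = new Contact(tenantId, name.trim());
            dao.save(contact);
            events.raise("contactCreated", contact);
            return contact;
        }
    }

    public static void main(String[] args) {
        ContactDao dao = new InMemoryContactDao();
        EventBus events = new EventBus();
        events.observe("contactCreated", c -> System.out.println("observer saw: " + ((Contact) c).name));
        ContactBo bo = new ContactBo(dao, events);
        bo.create("tenantA", "Ada");
        bo.create("tenantB", "Bob");
        System.out.println("tenantA rows: " + dao.findAll("tenantA").size());
    }
}
```

The point of the sketch is the direction of the dependencies: the BO knows about the DAO and the event bus, but nothing about who is listening, which is what keeps the observers decoupled.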

When mapping the logical architecture to the deployment architecture, the layers were assigned to tiers as follows:

Web Tier (for any Internet facing content)

  • REST Objects
  • Action Objects (presumably with XHTML facelets)

Application Tier (for any internal facing content)

  • Facade Objects
  • Business Objects
  • Data Access Objects
  • Observer Objects
  • ESB Objects

Database Tier presumably will be abstracted away by the JDBC driver as well as any ORM component, i.e. Hibernate.

Then there are objects/interfaces that are common across the Web and Application tiers, i.e.

  • Proxy Objects (so that any client of the facades… has the code needed to call the facades)
  • Value Objects (objects mapped to database tables but potentially passed all the way up to the Action Objects.)
  • any interfaces that the Proxy might leverage… and the facade layer implements.  These interfaces need to be available to both the Web and the Application layers.
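
To make the role of the shared interfaces a little more concrete, here is a small plain-Java sketch under some assumptions: a facade interface lives in the common code, the application tier implements it, and the proxy hides how the facade is located.  The names (GreetingFacade, GreetingProxy, GreetingAction) are made up for illustration, and a simple Supplier stands in for whatever lookup (local interface, RMI, Web Service) would be used in practice.

```java
import java.util.function.Supplier;

// Sketch of Action -> Proxy -> Facade wiring through a shared interface (hypothetical names).
final class ProxySketch {

    // Shared interface: must be visible to both the Web and the Application tiers.
    interface GreetingFacade {
        String greet(String name);
    }

    // Application tier: the facade implementation (would be a @Stateless bean exposed remotely).
    static final class GreetingFacadeBean implements GreetingFacade {
        @Override
        public String greet(String name) {
            return "Hello, " + name;
        }
    }

    // Common code: the proxy is the client of the facade.
    // How the facade is found is hidden behind the Supplier (an assumption of this sketch).
    static final class GreetingProxy implements GreetingFacade {
        private final Supplier<GreetingFacade> locator;

        GreetingProxy(Supplier<GreetingFacade> locator) {
            this.locator = locator;
        }

        @Override
        public String greet(String name) {
            return locator.get().greet(name);
        }
    }

    // Web tier: an action object that only ever talks to the proxy.
    static final class GreetingAction {
        private final GreetingFacade proxy;

        GreetingAction(GreetingFacade proxy) {
            this.proxy = proxy;
        }

        String submit(String name) {
            return proxy.greet(name);
        }
    }

    public static void main(String[] args) {
        // Same-JVM wiring; a remote deployment would swap in a different locator.
        GreetingAction action = new GreetingAction(new GreetingProxy(GreetingFacadeBean::new));
        System.out.println(action.submit("Mark"));
    }
}
```

Because the action only depends on the shared interface, moving the facade to another box should only change the locator handed to the proxy, not the Web tier code.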

So from a build perspective there are three separate build artifacts created, i.e.

  • Client – Where the common objects/interfaces go, i.e. Proxy Objects, Value Objects, interfaces, etc.
  • Facade – Where the application layer objects go, i.e. Facade Objects, Business Objects, Data Access Objects, Observers, etc.
  • Site – Where the Web layer objects go, i.e. Action Objects, etc.

In subsequent blogs I’ll go into more detail about each of the logical layers, i.e. basic responsibilities.

December 11, 2008

Writing Code vs. Generation (80/20 rule)

Filed under: Concepts — Mark De Lanoy @ 6:11 am

When I first started doing applications it was relatively simple, i.e. write some code that ran on your PC … that usually read stuff from a file, i.e. a document.  Over the years somehow we’ve become more technically advanced and that’s made life complicated as a developer…  Now for creating Web applications you might need to do the following, i.e.

Web Content (the view)

  • HTML (XHTML)
  • CSS
  • JavaScript
  • Images

Web Controllers (the controller like Struts or JSF)

  • some form of controller code
  • some configuration (navigation rules)
  • some form of page assembly from page fragments (like Struts Tiles)
  • some form of validation
  • localized text

Web Services

  • XSD’s (XML Schemas)
  • WSDL (Web Services Description Language)
  • WS-?
  • Java code (or…)

Business Logic

  • Code
  • Interfaces

Persistence

  • Database schemas
  • Database mappings (Hibernate, etc)
  • Value objects
  • DAO (Data Access Objects)
  • Stored Procedures

On and on and on… Not to mention all the unit tests, mock objects, test data, documentation, etc.  There just has to be a better way.  

Frameworks to the Rescue…

Many frameworks have come along, all with the goal of simplifying things.  There was Spring (pre 2.0) that had a lot of XML configuration in addition to relatively simple code.  I did that for a while but grew tired of the XML (especially Acegi).  I did experiment with XDOCLET tags within the Java code and various ANT/XDOCLET generation.  Worked great for a while until XDOCLET was killed off in favor of XDOCLET2.  Still worked but XDOCLET2 wasn’t as complete as XDOCLET1 and didn’t seem to have a lot of community interest.

JBOSS SEAM had been lurking for a while.  I’d look at it and then go back to SPRING.  Six months later I’d look at SEAM again.  Eventually I got some books and liked how they leveraged Java5 annotations (in favor of XML configuration files).  Following an XP (extreme programming) concept where the code is self-documenting … it became quite appealing to move to that framework.

Code Generation to the Rescue…

Frameworks in and of themselves only get you partly there.  You still need to write a lot of code, so what can be generated?  Some can be generated at build time, i.e.

  • Interfaces could be generated from classes (if the class and the method that should be exposed in the interface… are annotated somehow, i.e. XDOCLET tags, JAVA5 Annotations, etc.)
  • Configuration files could be generated from classes, i.e. the classes could state the default value, i.e. Hibernate Annotations on value objects is a great example.
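
As a hedged illustration of the first idea, the sketch below derives interface source text from the annotated methods of a class.  Note the assumptions: it uses runtime reflection and a made-up @Expose marker annotation for brevity, whereas a real build-time setup would more likely use XDOCLET or an annotation processor; AccountBo and the generated interface name are also hypothetical.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Sketch: generate an interface's source text from methods marked with a custom annotation.
final class InterfaceGenSketch {

    // Hypothetical marker: "this method belongs in the generated interface".
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Expose {}

    // Hypothetical business object with two exposed methods and one internal helper.
    static class AccountBo {
        @Expose
        public String findName(long id) {
            return "account-" + id;
        }

        @Expose
        public void rename(long id, String name) {
            // no-op in this sketch
        }

        public void internalHelper() {
            // not annotated, so it must not appear in the interface
        }
    }

    static String generateInterface(Class<?> type, String interfaceName) {
        Method[] methods = type.getDeclaredMethods();
        // Reflection does not guarantee method order, so sort for stable output.
        Arrays.sort(methods, Comparator.comparing(Method::getName));

        StringBuilder src = new StringBuilder("public interface " + interfaceName + " {\n");
        for (Method method : methods) {
            if (!method.isAnnotationPresent(Expose.class)) {
                continue;
            }
            // Parameter names are not reliably available at runtime, so use arg0, arg1, ...
            List<String> params = new ArrayList<>();
            Class<?>[] types = method.getParameterTypes();
            for (int i = 0; i < types.length; i++) {
                params.add(types[i].getSimpleName() + " arg" + i);
            }
            src.append("    ")
               .append(method.getReturnType().getSimpleName()).append(' ')
               .append(method.getName())
               .append('(').append(String.join(", ", params)).append(");\n");
        }
        return src.append("}\n").toString();
    }

    public static void main(String[] args) {
        System.out.print(generateInterface(AccountBo.class, "Account"));
    }
}
```

The sketch ignores generics, exceptions and imports in the generated source, which a real generator would have to handle.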

Others can be generated at run time, i.e.

  • Web Services from EJB’s (or POJO’s) if using JEE Annotations.

So when starting new projects… I look at what I absolutely need to do versus what’s boilerplate code.  The boilerplate code I either try to generate (at build time), handle at run time via Annotations/Interceptors, or bury in an abstract class.  So ideally development focuses on design, complex code (the 20% in the 80/20 rule) and quality (unit tests, etc).  That leaves code generation, inheriting from abstract classes, and/or AOP as ways to reduce code bloat, reduce bugs, and give developers and QA more time to develop/test the cool things.
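
For the run-time option, here is a minimal sketch of an annotation-driven interceptor built with the JDK’s dynamic proxies (java.lang.reflect.Proxy).  It is only meant to show the shape of the idea: the @Cached marker, PriceService and withCaching are hypothetical names, a plain HashMap stands in for a real cache, and an EJB3 or Seam interceptor would be wired up by the container rather than by hand.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch: a marker annotation plus an interceptor that adds caching without touching the business code.
final class InterceptorSketch {

    // Hypothetical marker annotation: "results of this method may be cached".
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Cached {}

    interface PriceService {
        @Cached
        int priceOf(String sku);
    }

    // The "real" implementation contains no caching code at all.
    static final class SlowPriceService implements PriceService {
        int calls = 0;

        @Override
        public int priceOf(String sku) {
            calls++;
            return sku.length() * 100;
        }
    }

    // Wraps the target in a proxy that caches the results of @Cached methods.
    // Not thread-safe and never evicts; sketch only.
    static <T> T withCaching(Class<T> iface, T target) {
        Map<List<Object>, Object> cache = new HashMap<>();
        InvocationHandler handler = (proxy, method, args) -> {
            if (!method.isAnnotationPresent(Cached.class)) {
                return method.invoke(target, args);
            }
            // Cache key: method name followed by the argument values.
            List<Object> key = new ArrayList<>();
            key.add(method.getName());
            if (args != null) {
                key.addAll(Arrays.asList(args));
            }
            if (!cache.containsKey(key)) {
                cache.put(key, method.invoke(target, args));
            }
            return cache.get(key);
        };
        Object proxy = Proxy.newProxyInstance(iface.getClassLoader(), new Class<?>[] {iface}, handler);
        return iface.cast(proxy);
    }

    public static void main(String[] args) {
        SlowPriceService slow = new SlowPriceService();
        PriceService service = withCaching(PriceService.class, slow);
        service.priceOf("abc");
        service.priceOf("abc");
        System.out.println("underlying calls: " + slow.calls);
    }
}
```

The business class stays free of boilerplate; the annotation is the only metadata it carries, and the interceptor decides what to do with it.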

Annotation Overkill

The issue with code generation is that it needs something to generate off of… metadata.  Annotations (either XDOCLET (@@) or Java5 Annotations (@)) are ideal, but your classes, fields and methods could become festooned with annotations.  So pick your poison, i.e. lots of code/configuration files or lots of metadata throughout your code.  No easy answers so I just picked one, i.e. I’d rather have one file with code, javadoc, comments, and metadata … than lots of files.

Slower Builds

More code generation results in slower builds so pick your poison, i.e. spend a lot of time writing code with really quick builds… or spend little/no time doing boilerplate/configuration while your CPUs and the fire alarm become acquainted.

Summary

So I picked metadata (first XDOCLET and subsequently Java5 annotations exclusively) and slower builds (they can be optimized).  Subsequent blogs will detail what’s written and what’s generated.

November, 2008 Sprint

Filed under: Ramblings, Sprints — Mark De Lanoy @ 5:26 am

The November sprint focused on:

  • Full Text Searching using Hibernate Search/Lucene
  • Caching using ehCache, i.e. marker annotations on methods, interceptors, tasks for flushing caches, etc
  • Merging Duplicates
  • Making Duplicates
  • Bulk Updates in objects, i.e. update multiple fields in selected objects simultaneously.
  • System Management, i.e. synthetic transactions, cache purge tasks, full text index purge/re-index tasks, performance statistics, etc.
  • Event triggering/listening using JBOSS Seam events/observers
  • Moving to JBOSS Seam’s security model

A lot was done and there should be some blog postings detailing the lot.
