Monday, December 14, 2015

Don’t patch your database!

I imagine there was a lot of spilled brandy and lost monocles upon reading that title. Now that I have your attention, what I really meant was “Don’t patch your database without giving it careful thought.” I bet some of you are probably thinking, “Well of course that makes sense, why wouldn’t you do that?” That’s a really great question. It does make sense yet many people approach their patching plans without thinking through all the possible factors.

 “But wait,” you say. “I worked directly with Oracle to figure this out! They strongly recommended this patch. It has fixes that can help this problem.” Okay, that sounds great. Now tell me how applying this patch will affect the business?

A technically-oriented person would say that it fixes a problem that can only help the business. It’s a no-brainer to apply it. What they might not consider is that there’s often more than one fix in a patch. Sure you’ve got this one bit in there, but what about the rest of it? What other things does the patch touch? Are you sure that the improvements within are all beneficial (or at least benign) to the system?

An extension of that topic is “recommendation creep”. This is when you’re applying a patch with specific fixes, but there are strongly recommended security patches or other add-ons that can be applied at the same time. It might be very tempting to include these patches, but you have to ask yourself if they are truly going to add value right now. It doesn’t matter how “strongly recommended” a patch is – if it’s not adding value then it really is okay to leave it out.

All right, you’ve looked at the patches and determined that you’re not going to break anything. In fact, you took out a patch that doesn’t really matter right now. The next thing to look at is how the actual patch event will affect the business. When you are applying the patch, what is happening to the business?

I’ll use a real-world example for this. A company with performance issues decided to apply a database combo patch at the recommendation of a consulting company and the database vendor. The only problem was that applying the patch required 6 hours of downtime for their entire business. The company had 24/7 worldwide operations and could not find more than 4 contiguous hours even after shifting around numerous critical operations.

They asked for my advice on the patch, and I found that the patch they wanted to apply was a combo patch – three separate patches bundled together. In truth, only one of those three patches were required and it could be applied in a fashion that would result in zero downtime. In fact, they could still apply the entire combo patch in a slightly different order and reduce the total downtime to only one hour.

When you make the decision to patch the database, you’re not just fixing a problem. You could be introducing new problems as well, and not just with additional steps and fixes. Before you patch, ask yourself these questions:
  • Does this patch actually fix a problem? Don’t apply patches just because you hope it’ll fix something.
  • What else does this patch do? Don’t inadvertently apply something that changes something critical.
  • Are there parts of this patch that aren’t really needed? It might be nice to add on that security patch, but don’t change too much at once.
  • How does the patch process affect the business? Always try to reduce downtime even if it makes the process itself a little more complex.

Remember this: In the end, you are the one applying the patch. No matter how strongly someone recommends you do something, you are the decision maker. You really don’t have to apply a patch just because it’s on a list somewhere. Look at the patch, study it, test it, and know exactly how it’s going to affect your system and your business before you drop it in.

Wednesday, November 11, 2015

Oracle Standard Edition is changing in December

You probably already know that Oracle includes the Standard Edition (SE) of their database as part of JDE E1 licenses. Currently, Oracle SE allows customers to use four CPU sockets and unlimited cores. Oracle Real Application Clusters (RAC) is free with SE as long as you don’t have more than four sockets in the cluster. While SE does not have many of the advanced features that Enterprise Edition (EE) has, most JDE E1 clients find that it’s more than sufficient to power their enterprise.

This changes in December when the latest Oracle 12.1.0.2 patch is released. Oracle SE is being replaced by a new product called Standard Edition 2 (SE2), and with it comes many important differences. The biggest is that SE2 will be limited to two sockets and 16 CPU threads (cores). Oracle RAC is still available, but you cannot exceed the total number of sockets or cores. That means a two-node RAC on SE2 can have 1 CPU socket and 8 cores per node. Existing installations of 12c SE will go on free extended support until Dec 2016 (all versions of 11g go off support completely in January).

Despite this major change, using the included Oracle 12c SE2 for your JDE E1 database is still far cheaper than any other alternative for a standalone DB server. Many clients don’t use more 16 cores on their databases, so this will probably not affect the majority of JDE E1 installations.

That’s not to say that nobody uses more than 16 cores. In those cases, upgrading existing databases to 12.1.0.2 will require clients to change their configuration to match the new licensing requirements. If you’re not ready to downgrade your specs, another option is to remain on the current 12c Standard Edition indefinitely. Support may end in December 2016, but you can still keep using the database.

If you have too much activity to scale back the number of CPUs, consider splitting the database in two. E1 allows you to move any part of your data to any database and have it all be transparent to the application. Simply add a new Oracle SE2 database, then move some of the data over to that new database. You could, for example, put all the system databases (system, server map, object librarian, data dictionary) on the new database server and leave production data on the original machine.

The last Oracle option is the most expensive but is also the most flexible. Oracle Enterprise Edition (EE) allows unlimited sockets and cores, plus you have access to all the advanced features available for the Oracle database. If you’re already on EE then all this talk about changes to SE doesn’t concern you at all.

References
Oracle database edition comparison

Oracle SE2 product page

Oracle Lifetime Support Policy

Wednesday, October 7, 2015

When was the last time you practiced recovery of your production database?

Are you prepared to recover your database in the event of a disaster? Go ahead and pull up your disaster recovery plans and let’s go over them now. I’ll wait. Oh, you don’t have it written down? Hmmm.

You’d be very surprised how many companies don’t have a disaster plan at all, much less a plan that is written down in some way. Most that do have a plan are dependent on a single person to execute it. Not only that, the person in question has the entire plan in their head.

Here’s a scenario: Your DBA has always wanted to climb Mt. Everest, and they’re going to be gone for two weeks with no contact to society at all. Two days after they leave you get a disk failure on your primary storage. No problem, you think, we have redundancies. Your vendor swaps out the disk, but something is wrong: The data has magically disappeared from the redundant disks.

Sound far-fetched? Not really. Even if you’re hosted by Amazon AWS, you can suddenly lose all your data with no warning or explanation. What do you do now? It’s a disaster, time to recover.
The key part of a disaster recovery operation is a good backup. That’s a no-brainer. How often you back up and what kinds of backups is up to your business. How much data can your company lose? 24 hours? 2 hours? Ten seconds? That will drive your backup strategy.

Now that you have a backup, what is your Disaster Recovery plan? Don’t tell me it’s just “restore the backup”. What about third-party software? Cron jobs? Config entries? All those little bits and pieces you’ve cultivated over the years could be gone. The time to figure out what those were is not when the C-levels are breathing over your shoulder counting the money being lost while their systems are idle.

Practice your DR plan. The plan is no good if it can’t be executed successfully. Maybe you’ve written down a comprehensive plan, but step 17 is a reference to somebody’s blog that was taken down long ago. You’re definitely going to want to have it all local and all spelled out. You might want to go as far as to print it out and store it in a bank deposit box. If your disaster is that the building was destroyed by a tornado, the easiest part is going to be buying new hardware. The hard part is recreating your DR plan from scratch.

Don’t forget to back up your backup scripts and DR plans. It may not be that hard to recreate a backup plan, but for the poor schlep left behind while you’re climbing Everest - it could be impossible.


If you don’t have a Disaster Recovery plan right now, you can create it immediately. Open up a blank document, title it “Disaster Recovery Plan”, then type out the sentence “I do not have a disaster recovery plan.” Email that to your boss and file it where everyone can find it. At least they will know you are prepared. 

Monday, September 21, 2015

How to run SQL on the deployment server local Oracle database.

Have you ever needed to connect directly to the local Oracle Planner database on the deployment server? Maybe you need to change server names or update the ESU list. It's actually pretty easy when you remember one simple step.

But first - Did you know that the planner database is a full version of Oracle Enterprise Edition (EE)? Yes, it's true. It does have a significant restriction though: You can only use it for the E1 planner database. If you want to put anything on that database that is not the planner metadata you will have to pay $47,000 for a full EE license.

Now, how do you get logged in to the local Planner database to run some SQL? To get started, let's try logging in to a planner DB without doing anything special. Open a command window and enter the following command:

sqlplus / as sysdba


In this example, the ORA-01031 error means you are not in the Windows “Oracle DBA Group”. To fix this, you need to add your user to the group. Open up Server Manager and go to Configuration/Local Users and Groups/Groups.



Find the group ora_dba and double click it:


In there click Add, find your name, and click OK. You’ll see your name in the Members: list now. Click OK


Close out of the command window you were using before and open a new one. Try to connect to sqlplus again:


There you go! You’re now connected to the planner database.

Thursday, August 13, 2015

Table operations with OMW Part 1 - Copy table structures

I'm not sure why this is, but a lot of people don't know that with the E1 Object Management Workbench (OMW) they can create new tables and make copies of existing tables then generate them in any data source. With OMW you can also copy a table and its data from one data source to another. For existing tables you can regenerate them (which removes all data) and also recreate all the indexes.

This first blog on OMW table operations will show you how to copy the structure of a table. You might need a copy of table to use as a work table or part of a custom project. To keep things short, I have to assume you already know the basics of using OMW including how to create projects and add users to a project.

Copy table structure
Go in to OMW and move a table into your project. In this example we'll use the F0010. Using the search screen, find the F0010 and add it to your project.

Finding an object in OMW

The F0010 in your project

If you just want a copy of the table structure, all you have to do now is highlight the table in your project and click the Copy button at the top. You'll now see the Copy Object screen. Remember - you're not copying data, only the table structure.



In accordance with best practices, I have named my new table F550010W and changed the description. I've also set the product and product system codes to 55. The rest is fine to leave as-is. Click OK when you are satisfied.

You're now returned to the Table Design screen. In there click the Table Operations tab. There's a lot of buttons in there, but for now we're only concerned with the Generate Table button.



Clicking the Generate Table button takes you to the Generate Table window. Here you can select the data source and the object owner password.


Since I have created a new table name I will leave the data source alone. I won't modify the password because OMW will automatically fill in the correct one if you haven't changed the defaults. Click the OK button when you're ready to create the table.


Success! Your table has been created in the Business Data - Test data source. Click all the OK buttons you can until you are back at the main OMW window. When you find your project again the new table will be listed.


You don't need to check it in or build a package to start using the table for a few things. You can open the table in UTB and even go directly to the database and work on it there if you wish. Check in the table when you want to promote and build it or simply to make the specs available to other developers,

In part two I'll tell you how to delete that table (HINT: Click the delete button) and I'll demonstrate how to copy a table and all of its data to a new location.

Friday, July 31, 2015

ORACLE 11g IS DEAD! LONG LIVE 12c!

At the end of this year, the Oracle 11.2.0.4 database goes on Extended Support. 11.2.0.3 hits extended support on August 27th - less than one month away! 11.2.0.1 and 2 are completely out of support.

For 11.2.0.3 and .4, clients cannot get support or bug fixes after Extended Support ends unless they purchase additional coverage. There is no such option for lower releases.

That's not all. Oracle 12c is only supported as of the 9.1.4 E1 tools release. If a client is on 11g and wishes to go to 12c, they also have to be at that TR or higher.

I know many companies are running 11g and don't know that the database is going off of support. For the ones that do know of their impending doom, some are making plans to upgrade without considering the tools release they're currently on.

Are you going to have problems if you don't upgrade all the things? I've got good money on "no". My theory on upgrades and patches is that you should only do it if you have a need. A need for new functionality or a need to patch something. Otherwise if everything is running fine why go through the trouble?

Still, most companies don't like the idea of running out-of-date or unsupported software. With the various deadlines approaching fast, now is a good time to sell some upgrades. Whether you're selling to a client or just to your own boss, you better get on it before the end of support. Better to forewarn people than trying to explain why you didn't say anything prior to the deadline.

Wednesday, July 29, 2015

I was just doing an environment refresh and the database crashed!


Business as usual
At the end of the day, you get asked to refresh PY business data from production. No problem, you think. You'll kick it off, let it run, then the next morning it'll be good to go. After all, nobody uses the non-prod environments after hours so it's not going to matter if they get a bit wonky while the refresh is in progress.

When things hit fans
Around 8pm you get a call: JDE is down! You get off the couch and log in to the system. Sure enough, you try to run something inside E1 but you get an error like this:



If you were to look in the server jde.logs you'd see this:

3956/6844 MAIN_THREAD                            Fri Jun 05 12:17:20.351000            Jdb_ctl.c4091
            Starting OneWorld

3956/6844 MAIN_THREAD                            Fri Jun 05 12:18:04.439000            dbinitcn.c899
            OCI0000065 - Unable to create user session to database server

3956/6844 MAIN_THREAD                            Fri Jun 05 12:18:04.439001            dbinitcn.c904
            OCI0000141 - Error - ORA-00257: archiver error. Connect internal only, until freed.
3956/6844 MAIN_THREAD                            Fri Jun 05 12:18:04.439002            dbinitcn.c520
            OCI0000367 - Unable to connect to Oracle ORA-00257: archiver error. Connect internal only, until freed.
3956/6844 MAIN_THREAD                            Fri Jun 05 12:18:04.439003            Jdb_drvm.c798
            JDB9900164 - Failed to connect to ovsorcl


Maybe you're an experienced or brave CNC and you know how to log in to the database via SQL*Plus. You'd be greeted by this message:


The first screenshot from E1 is the most descriptive message you will get, but it's not very understandable unless you are an experienced DBA. What can add to the confusion is when the server has plenty of free space available.

Everyone is yelling at me to fix it
First, just stop everything you and everyone else are doing. You're probably hearing people throwing out ideas that range from good to crazy - but they're all wrong. Most people's first reaction is to restart E1, the database, or reboot all the servers. When that doesn't work you'll start deleting files on the database server and you might even figure out where the archive logs are and delete those. Unfortunately, none of those actions are going to fix the problem. 

I'm so confused
To understand what's going on and how to fix it, you have to know a little bit about how the Oracle database works. Whenever you do something inside E1 via an application or a UBE it sends a SQL statement to the database. It could be requesting data, updating rows, or inserting new records. Every single action E1 performs on the database is recorded into the redo log files. 

Those redo logs collect every SQL statement executed and are used to recover the database. When a redo log gets full, it writes its contents to another file called an archive log. There are only a handful of redo logs and they get overwritten frequently. Archive logs are continuously created and can be backed up, then later used to restore your database to any point in the past.

So what's the problem?
Obviously if you run out of disk space you're going to have issues with any piece of software. It's the same with a database. If the database can't write any new archive logs (for example), then it can't recover from a failure. When archiving stops, the database stops.

But I have free disk space!
This is entirely possible. Space for archive logs is allocated from within Oracle. You can have 200 GB of space available, but if you've only set aside 5 GB for archive logging inside Oracle you're never going to max out your disk.

Okay, enough talk! How do I fix this?
The easiest, fastest way to get your database going is to expand the space Oracle uses for the archive logs - IF you have extra disk space. First, you need to find out where your archive logs are going. An example default destination is /u01/app/oracle/fast_recovery_area/ORCL/archivelog. Look in there (or similar) and if you see a bunch of directories named in the style of YYYY_MM_DD then you've struck gold. If you can't find them you're going to have to look for them.

If SQL*Plus isn't letting you connect due to the archiver error, you can connect like so:

sqlplus /nolog 
SQL> conn sys/ as sysdba 

Once connected you can run the following SQL statement to find out where they're going:

SQL> show parameter db_recovery_file_dest

Wherever the log files are going, find out how much room is left on the disk. Let's say that on your /u01 mount you have 50GB free and you want to expand the archive logs into that. The next thing you need to do is see how much space Oracle thinks it has for the logs.

SQL> show parameter db_recovery_file_dest_size

We'll pretend this returns a value of 20GB. That means we can increase the allocated space to 70GB which will use up all available space on the disk. Now all you have to do is tell Oracle to use it:

SQL> alter system set db_recovery_file_dest_size = 70GB scope=BOTH

As soon as you execute this statement your database will start working again. No need for restarts or reboots, the DB just goes on about its business.

My archive log destination is as big as the disk
Well now you're in a real pickle. Still, there is a simple solution - do a full backup including archive logs. I'm not going to go in to every possible way to back up a database, but here is a simple command to clear out archive logs using RMAN:

RMAN> BACKUP ARCHIVELOG ALL DELETE ALL INPUT;

When it's finished you'll have a backup file and all the archive logs will be gone. You'll have some space freed up, but it'll be a good idea to move that backup file off of the server to open it up even more.

Backup takes too long/goes to same disk as archive logs/I have nowhere to put it.
Back in the pickle again I see. Your next best option is pretty simple but also the most dangerous: Delete the archive logs. This frees up space but you negatively affect (or completely eliminate) your ability to restore your database to any point after the last backup. However, on a non-production system or even during a production emergency this might be the best option.

First thing to do is find your archive log destination. Use the same instructions from above (show parameter db_recovery_file_dest) then delete the directories and files in the archive log destination. To finish up you have to tell RMAN the files are gone:

rman target /
crosscheck archivelog all;
delete noprompt expired archivelog all;
exit

That connects you to RMAN, compares the list of archive log files Oracle knows about to what actually exists, then deletes the entries for any that don't match up. As soon as you finish the RMAN statements your database will resume on its own. There is nothing more you should have to do for operations to continue. Now you have free space on your disk and within Oracle. 

The horrible, horrible catch when deleting archive logs
You've broken your backups. If the database were to fail due to data corruption, power failure, tornado, or a giant monster attack then you're going to have some very unhappy executives. You need to immediately take a full backup of your database to un-break it.

How do we stop this from happening again?
  1. Make sure your database has enough space. Not just for archive logs, but also for your data files and everything else. Figure out how much everything grows on a day-to-day basis, then bust out your calculator and find out where it's going over the next 6-12 months.
  2. Be prepared for large data load operations. If you're space constrained you probably need to take precautions before doing a refresh. Your users need to let you know before they do any large data inserts.
  3. Set up database and server monitors to warn you well in advance. If you can be alerted to the space issues in enough time you can fix everything before anybody knows there is a problem.

Reality check
I kind of glossed over some details and skipped some advanced techniques so I could keep things relatively simple while also providing solutions almost anyone could use. Having a database halt is a very stressful situation and it can be difficult to troubleshoot especially if you're not a DBA. The actions I talked about in this post should only be followed by a non-DBA in an extreme emergency. There is too much that could go wrong and you don't want to make a bad situation worse. 

Proceed with caution.

Friday, July 24, 2015

The shocking truth: You can't create a read-only user on your E1 database!

Did you know that by default all users created on Oracle and MSSQL databases have full permission on all JDE tables? It's true. Not only that, this permission takes precedence over any restrictions you might place on a particular DB user such as making them read-only. This is hard-coded in to EnterpriseOne and Oracle has no plans to change this behavior.

Why is this? JDE is designed to insulate users and administrators from the hardware and database. This is part of the spirit of CNC - Configurable Network Computing. Yes, CNC actually means something other than "Person Who Gets Blamed For Everything". The CNC concept means that with only JDE tools a user can perform select, insert, update, delete, create, and drop operations on any table.

Because you don't have to know what database or operating system is being used, you don't need to have any specific database skills. Without ever leaving EnterpriseOne you can do full environment refreshes, table copies, add new indexes, and much more. The best part is that it is all controlled by JDE security. You never have to touch the database to do your day-to-day job.

As flexible as that is, there are often good reasons to directly access the database. One example is to do faster refreshes. In the middle ground are SQL fixes due to data issues. On the other end are bad reasons which (in my opinion) include providing read-only access and large exports of raw data. I'll talk about why those are bad in another post.

No matter what the justification is for providing direct access, the shocking truth is that every database user can view and manipulate all JDE tables and data on the system - and there is nothing you can to do change that.

I take that back. There actually is something you can do to change it. It's called Oracle Public Shutdown (OPS), a somewhat complicated procedure that removes these global permissions and allows administrators to have better security control at the database level. Once you've run OPS you can now create read-only users as well as any other levels of access you can think of.

Unfortunately for your administrative workload, OPS is not permanent. Every time a table is created or regenerated in OMW, every user gets full permissions reassigned to that object. Every ESU, ASU, and Update that contains tables will reset those permissions. An upgrade will reset many permissions. What this all means is that you have to come back to your database every so often and re-run OPS.

The next time someone asks for a database user, make sure there is a really good justification for their access. You'd be surprised how often they can do what they want inside E1. Plus, you don't want to give what is essentially DBA authority to just any user. Related tip: Make sure you have good backups.

If your company is concerned about the security of their database - especially if they have SOX or FDA requirements - they may need Oracle Public Shutdown. Until then: Be careful. Leverage the tools included within EnterpriseOne wherever possible.

Reference: Oracle Public Shutdown