2015-11-26

handling disks for ASM - when DB, Linux and Storage admins work together

Proper management of ASM disks can be a complicated task.

At DOAG2015 I discussed with Martin Bach the concept used in my current company, where we implemented a setup which is consistent, robust and enables Storage, Linux and DB admins to work together easily.

When we started to think about ASM after 10.1 was released, we evaluated our options. asmlib was discarded quite early as it only increased complexity without adding value: we have a SAN (fibre channel) infrastructure with two separate fabrics, so a proper multipath configuration is needed in any case, and ASM (or asmlib) cannot handle that on its own. asmlib also hides storage details from DBAs/ASM admins, whereas we wanted every person involved to see as many details as possible with little effort.

We also saw that ASM sometimes takes a long time to scan for new disks if there are many "files" (devices) matching asm_diskstring. This rescan happens every time you access v$asm_disk, so use v$asm_disk_stat instead: that view does not rescan but only shows information about devices it already has in the SGA.
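
For illustration, a quick query against the *_stat view avoids the rescan (the column list is just an example):

select path, name, header_status, total_mb
from   v$asm_disk_stat;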

asm_diskstring

We set asm_diskstring to a dedicated directory; in our case it is /appl/oracle/asm_disks/*. This speeds up a rescan of all "disks", and it is also a clear indicator of all disks ASM uses. In that directory we have symlinks to devices in /dev/mapper/.
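
Setting the parameter itself is a one-liner on the ASM instance (shown as a sketch, assuming an spfile; the path is specific to our environment):

alter system set asm_diskstring='/appl/oracle/asm_disks/*' scope=both;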

symlinks

The symlink has this format:
/appl/oracle/asm_disks/360060e80167bd70000017bd700000007p1_p9500_b52_MONIQP01_000 -> /dev/mapper/360060e80167bd70000017bd700000007p1

Some information about the values we encode there:
360060e80167bd70000017bd700000007p1 is the WWN of the disk, together with its partition (p1). The WWN is very useful in every discussion with Storage admins, as it identifies the LUN from their perspective. We decided to partition the disks: our records show that Linux admins touch un-partitioned devices more often than devices which already carry a partition, and in early tests there were also some cases where the first block of a disk was cached by the kernel. Both issues are addressed by partitioning every disk. If required, partitioning can also help to adapt alignment.
p9500 is a short name which identifies the storage box with a name we can use during discussions. It is also encoded somewhere within the WWN, so it is pure redundancy, but it makes discussions much easier.
b52 is a short name identifying the datacenter. As our fabrics are spanned across several datacenters, it is sometimes nice to see the topology at a glance.
MONIQP01_000 is the label used in some storage boxes. It contains the diskgroup name and a sequence number. At the moment it is NOT the NAME of an ASM disk, but this could be introduced easily.
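
Creating such a symlink is a plain ln -s; a sketch with the values from the example above:

ln -s /dev/mapper/360060e80167bd70000017bd700000007p1 \
      /appl/oracle/asm_disks/360060e80167bd70000017bd700000007p1_p9500_b52_MONIQP01_000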

As the name of a diskgroup is coded into our naming schema, it is not accepted to reuse a disk for some other diskgroup. (Technically it is still possible, we just agreed not to do so.) Even though this seems to limit the DBAs' flexibility, there are good reasons for it: disks are sometimes created with dedicated settings/parameters for a special purpose, and reusing such disks in other diskgroups would cause strange and hard to find performance symptoms. So if disks are not needed anymore, we always "destroy" them and create new ones when needed.

udev rules

Our udev ruleset on RedHat6 is quite simple:
the file /etc/udev/rules.d/41-multipath.rules contains lines like this:
ACTION=="add|change", ENV{DM_NAME}=="360060e80167bd70000017bd700000007p1", OWNER:="oracle", MODE:="0660", GROUP:="asmadmin"
We do not do any mapping of names here - it's only there to set permissions.
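
After adding or changing such a rule the permissions can be applied without a reboot; a sketch (the device name is taken from the example above):

udevadm control --reload-rules
udevadm trigger
ls -lL /dev/mapper/360060e80167bd70000017bd700000007p1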

multipath

The config in /etc/multipath.conf is quite simple; it only contains the parameters required for the specific storage vendor / product.
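
Just to give an idea of the shape of such a file, here is a generic sketch (not our actual settings; vendor/product strings and parameters have to match the storage array's documentation):

defaults {
    user_friendly_names no
}
devices {
    device {
        vendor                "HP"
        product               "OPEN-.*"
        path_grouping_policy  multibus
        no_path_retry         12
    }
}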


I cannot say a lot about configurations outside of the Linux server, so for me both SAN fabrics and the storage system are "just working".




2015-11-21

anatomy of DBA_FEATURE_USAGE_STATISTICS

In the Oracle database the view DBA_FEATURE_USAGE_STATISTICS is provided to
"display information about database feature usage statistics".
That's all I can find in the (12.1) documentation.

Just in case anyone asks - DBA_FEATURE_USAGE_STATISTICS is not mentioned in the license guide at all.

Still, for me it's interesting what's behind the view and how its content is populated.

So I started to analyze its anatomy, starting with the view definition (slightly rewritten for better readability):
SELECT samp.dbid,      
       fu.name,  
       samp.version,   
       fu.detected_usages, 
       samp.total_samples, 
       Decode(To_char( last_usage_date,  'MM/DD/YYYY, HH:MI:SS'), 
              NULL, 'FALSE', 
              To_char( last_sample_date, 'MM/DD/YYYY, HH:MI:SS'), 
              'TRUE', 'FALSE') currently_used, 
       fu.first_usage_date, 
       fu.last_usage_date, 
       fu.aux_count, 
       fu.feature_info, 
       samp.last_sample_date, 
       samp.last_sample_period, 
       samp.sample_interval, 
       mt.description 
FROM   wri$_dbu_usage_sample samp, 
       wri$_dbu_feature_usage fu, 
       wri$_dbu_feature_metadata mt 
WHERE  samp.dbid = fu.dbid 
       AND samp.version = fu.version 
       AND fu.name = mt.name 
       AND fu.name NOT LIKE '_DBFUS_TEST%'    /* filter out test features */ 
       AND Bitand(mt.usg_det_method, 4) != 4  /* filter out disabled features */

This view is quite simple, and all important columns of wri$_dbu_usage_sample and wri$_dbu_feature_usage are shown in the projection.
The filters only hide some rows used for internal testing and features whose detection is disabled.

So the interesting part is those columns of wri$_dbu_feature_metadata which are not shown in the view.
This table contains 2 groups of columns:
The first group consists of the columns INST_CHK_METHOD and INST_CHK_LOGIC.
They describe if and how to check whether a detailed detection is needed at all.
INST_CHK_METHOD has 3 possible values right now:
DBU_INST_ALWAYS_INSTALLED  1  /* no test required */
DBU_INST_OBJECT               /* object name in LOGIC */
DBU_INST_TEST              4  /* only test */
DBU_INST_ALWAYS_INSTALLED means the check is always run. With DBU_INST_OBJECT the feature is only tested if the object named in INST_CHK_LOGIC exists.
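
A quick way to see how many features fall into each category is to query the metadata table directly (it is internal and undocumented, so this needs SYS and the result should be treated with care):

select inst_chk_method, count(*)
from   wri$_dbu_feature_metadata
group  by inst_chk_method;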

The detection itself is described in the columns USG_DET_METHOD and USG_DET_LOGIC.
USG_DET_METHOD has 3 possible values:
DBU_DETECT_BY_SQL        1  /* SQL specified in USG_DET_LOGIC */
DBU_DETECT_BY_PROCEDURE     /* Procedure specified in USG_DET_LOGIC */
DBU_DETECT_NULL          4  /* only test for _DBFUS_TEST_SQL_7 */
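
To see how a specific feature is detected, the logic can be read straight from the same internal table (the LIKE pattern is just an example):

select name, usg_det_method, usg_det_logic
from   wri$_dbu_feature_metadata
where  name like '%Partitioning%';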

If it is set to DBU_DETECT_BY_SQL, the SQL stored in USG_DET_LOGIC is executed and its result is stored. In case of DBU_DETECT_BY_PROCEDURE, the procedure named there is called. It has 3 parameters (with different names, but the types are always the same):
Argument Name Type   In/Out Default 
------------- ------ ------ ------- 
IS_USED       NUMBER OUT            
DATA_RATIO    NUMBER OUT            
CLOB_REST     CLOB   OUT    

DATA_RATIO will be stored in AUX_COUNT, CLOB_REST in FEATURE_INFO.
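
Just to visualise the calling convention, a purely hypothetical procedure of this shape could look like this (name and body are made up, only the signature follows the description above):

create or replace procedure my_feature_detect (
  is_used    out number,  -- usage flag evaluated by the sampling run
  data_ratio out number,  -- ends up in AUX_COUNT
  clob_rest  out clob     -- ends up in FEATURE_INFO
) as
begin
  is_used    := 0;
  data_ratio := null;
  clob_rest  := null;
end;
/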

This information should be sufficient to analyze everything inside and related to DBA_FEATURE_USAGE_STATISTICS.



During the research I found 2 ways to do an immediate run of all tests, instead of the default weekly run. Both update wri$_dbu_feature_usage, so the results are stored.

The first one is to kick mmon to run the usage sampling out of schedule:
alter session set events 'immediate trace name mmon_test level 6';

The second method is to run the sampling in the current session. This is quite nice for any kind of tracing:
alter session set "_SWRF_TEST_ACTION"=53;


But be aware that both methods can lead to a DETECTED_USAGES count that is higher than the number of weeks between FIRST_USAGE_DATE and LAST_USAGE_DATE.
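
After either of the two calls the fresh results can be checked right away, for example with a query like this:

select name, detected_usages, currently_used, last_sample_date
from   dba_feature_usage_statistics
where  last_sample_date > sysdate - 1/24
order  by name;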



For a proper mapping between entries in DBA_FEATURE_USAGE_STATISTICS and options/packs there is a script options_packs_usage_statistics.sql provided by My Oracle Support in Database Options/Management Packs Usage Reporting for Oracle Databases 11gR2 and 12c (Doc ID 1317265.1)