Web Content Display Installing Webmin in Ubuntu Hardy Heron
Web Content Display Data migration ETL Services Design and Deploy of Business Intelligence Servers Design and Deploy of Virtualisation Services Instant Deploy of apps server services We have pre-built apps servers (CRM,CMS, email/db/web, firewall, BI, data-migration etc.) which can be deployed in minutes! Our platforms are Vmware ESX or Amazon Elastic Cloud (AEC). For Business Intelligence (BI) we use the Pentaho suite.The Pentaho BI Platform is a process-centric, workflow-enabled, scalable platform for solving Business Intelligence problems. A direct quote: "The Pentaho BI Project provides enterprise-class reporting, analysis, dashboard, data mining and workflow capabilities that help organizations operate more efficiently and effectively. The software offers flexible deployment options that enable use as embeddable components, customized BI application solutions, and as a complete out-of-the-box, integrated BI platform. Ranked #1 in Open Source Business Intelligence." from their site http://www.pentaho.com/ Virtualisation services is primarily the free VMware server. We also have experience with VMware from desktop to ESX thus reducing Server deployment costs through server consolidation. Also reduce testing time and cost by using the test system virtualisation use of the VMware snap-shot capability. We manage hosted Oracle, SAPdb and MYsql Postgres databases and have extensive migration experience between them. Data ETL and Replication. We use the Pentaho toolset to effortlessly migrate data from anywhere to anywhere, between ANY databes, platform and file system. Enterprise class Document Management is achieved via Alfresco (a Documentum replacement), for smaller lighter needs we use Knowledgetree (DMS) and Plone (web CMS). Both can be 'rapidly deployed' as an appliance. Quick HOWTO change Mysql default data directory By default, MySQL’s datadir is placed in the /var/lib/mysql directory. Create the directory that will be new datadir (ex:-/home/db) chown the directory to the mysql:mysql user sudo chown -R mysql:mysql /home/db/* You need to stop the mysql server using the following command sudo /etc/init.d/mysql stop Now you need to edit the /etc/mysql/my.cnf file sudo vi /etc/mysql/my.cnf and look for “datadir = /var/lib/mysql” this si where mysql database default data directory here you need to change this one to your new directory datadir = /home/db copy the files from the old datadir to the new location. However, make sure that the files named Make sure that the files and directories are owned by mysql user Make changes in the my.cnf to point the new datadir. Restart the MySQL database sudo /etc/init.d/mysql start I'm always looking and can never seem to find this list AMI's on the amazon site. So here it is: |
Web Content Display Do You Find Gmail web client slow?Use Thunderbird 2.0 and get dramatically improved performance.Follow the instructions below to set up IMAP access in Thunderbird 2.0Note: if you're using an older version of Thunderbird, we highly recommend upgrading to version 2.0. To set up your Thunderbird client to work with Gmail:
To ensure that Gmail can communicate with your mail client, be sure to click Save Changeson the Forwarding and POP/IMAPsettings tab. Data Pump in Oracle Database 10g: Foundation for Ultra-High Speed Data Movement Utilities Data Pump Overview Data Pump is a new facility in the Oracle Database 10g Server that enables very high speed data and metadata loading and unloading to / from the Oracle Database. It automatically manages and schedules multiple, parallel streams of load or unload for maximum throughput. Data Pump infrastructure is callable via the PL/SQL package DBMS_DATAPUMP. Thus, custom data movement utilities can be built using Data Pump. Oracle Database 10g will see four of these: New command line export and import clients (expdp & impdp), a web-based Enterprise Manager export / import interface and a custom interface for the movement of complex Data Mining models. Data Pump is also the foundation for several other key features in the Oracle server: Streams-based Replication, Logical Standby, Grid and Transportable Tablespaces. For Streams-based Replication and Logical Standby, Data Pump quickly generates the initial configuration at the replicated site using Flashback technology to provide a consistent starting point. Initial Grid instantiation is based on Transportable Tablespaces which relies on Data Pump to move and “hook-up” the metadata for the objects defined in the transported tablespace set. When gathering requirements for Data Pump with large customers, we repeatedly heard, “Time is money. If you do nothing else, make the export / import operation much faster for large amounts of data”. We took that to heart. Data Pump dramatically decreases the elapsed time for most large export / import operations; in some data-intensive cases, by more than two orders of magnitude. In addition to increased performance, our customers enumerated many other requirements. As a result, Data Pump-based export and import clients support all the functionality of the original exp/imp as well as many new features such as checkpoint restart, job size estimation, very flexible, fine-grained object selection, direct loading of one instance from another and detailed job monitoring. Data Pump is an integral feature of Oracle Database 10g and therefore is available in all configurations. However, parallelism with a degree greater than one is only available in Enterprise Edition. This paper will start with an overview of Data Pump's architecture then describe its main features along with some best-use practices, then provide a brief comparison to the original export and import facilities. Architecture Master Table At the heart of every Data Pump operation is the “Master Table” (MT). This is a table created in the schema of the user running a Data Pump job. It is a directory which maintains all facets about the job: The current state of every object exported or imported and their locations in the dumpfile set, the job's user-supplied parameters, the status of every worker process, the current set of dumpfiles, restart information, etc. The MT is built during a file-based export job and is written to the dumpfile set as the last step. Conversely, loading the MT into the current user's schema is the first step of a file-based import operation and is used to sequence the creation of all objects imported. During export, note that the MT cannot span files (as other written objects can). If dumpfile sizes are being limited by the FILESIZE parameter, this must be large enough to contain the master table. As a reference point, a database with approximately 400,000 objects including 10,000 tables creates a master table 189 Mb in size. The MT is the key to Data Pump's restart capability in the event of a planned or unplanned job stoppage. Since it maintains the status of every object comprising the job, upon restarting, Data Pump knows what objects were currently being worked on and if they completed successfully or not. Process Structure There are various processes that comprise a Data Pump job. They are described in the order of their creation. Client Process - This is the process that makes calls to Data Pump's API. As mentioned earlier, Oracle Database 10g ships four clients of this API. This paper will only discuss the new export / import clients, expdp and impdp. They have a very similar look and feel to the original exp and imp, but have far more capabilities as will be described later. Since Data Pump is completely integrated into the Oracle Database server, a client is not required once a job is underway. Multiple clients may attach and detach from a job as necessary for monitoring and control. Shadow Process - This is the standard Oracle shadow (or foreground) process created when a client logs in to the Oracle server. The shadow services Data Pump API requests.1Upon receipt of a DBMS_DATAPUMP.OPEN request, the shadow creates the job which primarily consists of creating the master table, creating the AQ queues used for communication among the various processes and creating the Master Control Process. Once a job is running, the shadow's main task usually consists of servicing GET_STATUS requests from the client. If the client detaches, the shadow goes away, too. Master Control Process (MCP) - There is one MCP per job and as the name implies, the MCP controls the execution and sequencing of a Data Pump job. A job is divided into various metadata and data unloading or loading phases, and the MCP hands out work requests to the worker processes appropriate for the current phase. It mainly sits in this work dispatch loop during a job's execution. It also performs central file management duties, maintaining the active dump file list and handing out file pieces as requested by processes unloading data or metadata. An MCP has a process name of the form: <instance>_DMnn_<pid>. The MCP maintains job state, job description, restart and dumpfile information in the master table. Worker Process - Upon receipt of a START_JOB request, the MCP creates N worker processes where N is the value of the PARALLEL parameter. The workers perform the tasks requested by the MCP that consist primarily of unloading / loading of metadata and data. The workers maintain the object rows in the master table which comprise the bulk of the table. As database objects are unloaded or loaded, these rows are written and updated with the objects' current status: pending, completed, failed, etc. The workers also maintain what are called “type completion rows” which describe what type of object is currently being worked on: tables, indexes, views, etc. These are used during restart. A worker has a name of the form: <instance>_DWnn_<pid>. Parallel Query (PQ) Process - If External Tables (ET) is chosen as the data access method for loading or unloading a table or partition, N PQ processes are created by the worker given the load or unload assignment, and the worker acts as the query coordinator. These are standard parallel execution slaves exploiting the Oracle server's parallel execution architecture, thus enabling intra-partition loading and unloading. In a RAC, PQ processes may be created on an instance other than where the Data Pump job was initiated. All other processes described thus far are created on that initial instance. Data Movement Two access methods are supported: Direct Path (DP) and External Tables (ET). DP is the faster of the two but does not support intra-partition parallelism. ET does and therefore may be chosen to load or unload a very large table or partition. Each also has certain restrictions requiring the use of the other: For example, a table being loaded with active referential constraints or global indexes must be loaded via ET. A table with a column of data type LONG must be unloaded and loaded with DP. Both methods write to the dumpfile set in a compact, binary stream format that is approximately 15% smaller than original export's data representation. Metadata Movement The Metadata API (DBMS_METADATA) is used by worker processes for all metadata unloading and loading. Database object definitions are extracted and written to the dumpfile set as XML documents rather than as SQL DDL as was done by original export. This allows great flexibility to apply XSL-T transformations when creating the DDL at import time. For example, an object's ownership, storage characteristics and tablespace residence can be changed easily during import. Necessarily, XML takes up more dumpfile space than DDL, but the flexibility gained by deferring DDL creation until import is worth the trade-off. Since metadata is typically dwarfed by data in a dumpfile set, the reduction in binary stream size typically produces dumpfile sets the same as or smaller than those produced by original export. One other small point: The XML representing table metadata is stored within the MT rather than being written immediately to the dumpfile set. This is because it is used for several downstream operations, thus avoiding multiple fetches. Inter-process Communication Advanced Queuing (AQ) is used for communicating among the various Data Pump processes. Each job has two queues:
These queues have names of the form: 'KUPC${C|S}_<job-unique timestamp> File Management The file manager is a distributed component: As mentioned earlier, the actual creation of new files and allocation of file segments is handled centrally within the MCP. However, each worker and PQ process make local process requests to the file manager to request space, read a file chunk, write a buffer or update progress statistics. The local file manager determines if the request can be handled locally and if not, forwards it via the command and control queue to the MCP. Reading file chunks and updating file statistics in the master table are handled locally. Writing a buffer is typically handled locally, but may result in a request to the MCP for more file space. Directory Management Because all dumpfile set I/O is handled by Oracle background server processes, the operating system persona doing the I/O is “oracle”, not the user running the job. This presents a security dilemma since “oracle” is typically a privileged account. Therefore, all directory specifications are made using Oracle directory objects with read / write grants established by the DBA. For example, the DBA may set up a directory as follows: Create directory dmpdir1 as '/private1/data/dumps'; Grant read, write on directory dmpdir1 to scott; Then SCOTT can specify a dumpfile on the expdp command line as: expdp scott/tiger dumpfile=dmpdir1:scott.dmp If the file size is limited for manageability by the FILESIZE parameter, then potentially many dump files can be created. The file manager automatically maintains dumpfile set coherency via a globally unique identifier and other information written into the file headers. An import or SQL file job cannot start until all members of the job's dumpfile set are present in DUMPFILE parameter specifications Major Features This section will briefly describe some of the major new features in Data Pump. Performance Data Pump based export / import operations are typically much faster than their original exp / imp counterparts. A single thread of Data Pump's direct path data unload is about twice as fast as original exp's direct path unload. A single thread of Data Pump data load is 15X - 45X faster than original imp. And of course, Data Pump operations can be specified with parallel threads of execution2. Also note that parallel threads can be dynamically added and removed to/from running jobs to tailor the job to the changing execution environment. During export, when there are two or more workers, data and metadata unloading proceed in parallel. Data Pump will also automatically build each index in parallel up to the parallel degree of the job. The permanent parallel degree of the index remains unchanged from its source value. I/O bandwidth is most important factor It is important to make sure there is sufficient I/O bandwidth to handle the number of parallel threads specified; otherwise performance can actually degrade with additional parallel threads. Care should be taken to make sure the dumpfile set is located on spindles other than those holding the instance's data files. Wildcard file support makes it easy to spread the I/O load over multiple spindles. For example, a specification such as: Dumpfile=dmpdir1:full1%u.dmp,dmpdir2:full2%u.dmp Dumpfile=dmpdir3:full3%u.dmp,dmpdir4:full4%u.dmp will create files named full101.dmp, full201.dmp, full301.dmp, full401.dmp, full102.dmp, full202.dmp, full302.dmp, etc. in a round-robin fashion across the four directories pointed to by the four directory objects. Initialization Parameters Essentially no tuning is required to achieve maximum Data Pump performance. Initialization parameters should be sufficient out of the box.
Data Pump's AQ-based communication and the metadata API both require some amount of SGA… make sure shared_pool_size is sufficient. Both the metadata API during export and the worker process during import may execute some fairly long-running queries that have the potential for exhausting rollback segments. This is mostly an issue in jobs affecting many objects. Make sure these are configured sufficiently large. For example, export / import of a database containing 400,000 objects required two rollback segments, each 750 Mb in size. Metadata API Metadata performance is about the same as the original exp and imp utilitities, but much more flexible and extensible. However, since most operations are dominated by data movement, most see a dramatic overall improvement. Network Mode Data Pump supports the ability to load one instance directly from another (network import) and unload a remote instance (network export). Rather than using network pipes which are not supported on all platforms, network mode uses DB links. During network import, the Metadata API executes on the remote node, extracting object definitions and sending them to the local instance for creation where the Data Pump job is executing. Data is fetched and loaded using insert as select statements such as: Insert into foo (a,b,c,…) select (a,b,c,…) from foo@remote_service_name These statements incorporate hints on both sides to access the direct path engine for maximum performance. Network export enables the ability to export read-only databases. Data Pump export cannot run locally on a read-only instance because maintaining the master table, writing messages to queues and creating external tables all require writes on the instance. Network export creates the dumpfile set on the instance where the Data Pump job is running and extracts the metadata and data from the remote instance, just as network import does. Data movement in network export is done exclusively by External Tables since 'create as select@service' style DML statements are required. With either network mode operation, it is expected that network bandwidth will become the bottleneck. Be careful that the parallel setting does not saturate the network. We discovered that many of our customers would implement a sort of 'network mode' with original exp and imp by exporting into a network pipe and importing out the other end. This overlapped the export and import operations, thus improving elapsed time. Given the dramatic improvement in the performance of file-based Data Pump operations, it is unclear if network mode can provide a significant reduction in elapsed time for instance initialization as it did with original exp and imp. Restart All stopped Data Pump jobs can be restarted without loss of data as long as the master table and dumpfile set remain undisturbed while the job is stopped. It doesn't matter if the job was stopped voluntarily with the clients' STOP_JOB command or the stoppage was involuntary due to a crash, power outage, etc. Sufficient context is maintained in the master table to know where to pick up. A client can attach to a stopped job with the ATTACH=<job name> parameter, then start it with the interactive START command. During import, sometimes an unforeseen, repeating problem with a particular object occurs that prevents further progress. START=SKIP_CURRENT will skip the current object and continue with the next, thus allowing progress to be made. Fine-grained Object Selection One could only choose to include or ignore indexes, triggers, grants and constraints with original exp and imp. With various client parameters, a Data Pump job can include or exclude virtually any type of object and any subset of objects within a type. Exclude The exclude parameter allows any database object type to be excluded from an export or import operation. The optional name qualifier allows you finer selectivity within each object type specified. For example, the following three lines in a parameter file: Exclude=function Exclude=procedure Exclude=package:”like 'PAYROLL%' “ Would exclude all functions, procedures and packages with names starting with 'PAYROLL' from the job. Include The include parameter includes only the specified object types and objects in an operation. For example, if the above three specifications were INCLUDE parameters in a full database export, only functions, procedures and packages with names starting with 'PAYROLL' would be written to the dumpfile set.. Content The content parameter allows one to request for the current operation just metadata, just data or both. Original exp's 'ROWS=N' parameter was equivalent to content=metadata_only, but there is no equivalent for content=data_only. Query The query parameter operates much as it did in original export, but with two significant enhancements:
Monitoring and Estimates Another requirement we heard from our customers was to provide better, more detailed monitoring capabilities. In addition to the standard progress and error messages printed by the client and into the log file, the new client interactive command STATUS will show detailed job information including overall percent done, the status of each worker process, the current objects being worked on and the percent done for each one. You can also specify a time interval in seconds for an automatic update of detailed status rather than manually requesting it. The start of every export job now also includes an estimate phase where the approximate amount of all data to be unloaded is determined. The default method for determining this is to estimate the size of a partition by counting the number of blocks currently allocated to it. If tables have been analyzed, statistics can also be used which should provide a more accurate estimate. This serves two purposes: 1) You get an idea of how much dumpfile space will be consumed. 2) All the information needed to start unloading tables is retrieved and ordered by size descending. This allows the MCP to schedule the unloading of metadata and data in parallel. The objects retrieved during this estimate phase are called table data objects with each representing a partition or the entire table if the table is unpartitioned. Since 0-N clients may be attached to a running job, one may start a job at work, detach from it, go home, re-attach and monitor it throughout the evening. New Clients: expdp and impdp Although the new export and import clients retain a similar look and feel to the original exp and imp clients, 100% parameter compatibility was not a goal since there are so many inconsistencies in the original clients' parameters. Where a concept makes sense for both an export and import operation, we made sure the parameter is the same for both. The new clients also support far greater capabilities: Interactive command mode - Typing control-C (^C) will invoke the interactive command mode and an 'export>' or 'import>' prompt will appear. From this prompt, you can request help, detailed job status, change monitoring parameters, dynamically add files (including wildcard specifications) to the job's dumpfile set, stop the job leaving it restartable, kill the job leaving it not restartable, change the degree of parallelism for the job, return to logging mode to continue receiving progress messages or exit the client and leave the job running. All modes of operation are supported: Full, schema, table, tablespace and transportable tablespace. Data Pump and its new expdp and impdp clients are a complete superset of original exp/imp's functionality3. Flashback is supported for exports and imports as of a certain time. WHERE clause predicates may be applied to individual tables by both expdp and impdp. Privileged users (those with either the EXP_FULL_DATABASE or IMP_FULL_DATABASE roles) may attach to and control jobs initiated by other users even if the job is stopped. Other Useful Features DDL Transformations - Because object metadata is stored as XML in the dumpfile set, it is easy to apply transformations when DDL is being formed (via XSL-T) during import. Impdp supports several transformations: REMAP_SCHEMA provides the old 'FROMUSER / TOUSER' capability to change object ownership. REMAP_TABLESPACE allows objects to be moved from one tablespace to another. This changes the tablespace definition as well. REMAP_DATAFILE is useful when moving databases across platforms that have different file system semantics. One can also specify via the TRANSFORM parameter that storage clauses should not be generated in the DDL. This is useful if the storage characteristics of the target instance are very different from those of the source. SQL file - Impdp can also perform a SQL file operation. Rather than creating database objects, this merely writes the equivalent DDL to a file in the form of a SQL script almost ready for execution. Only the embedded 'connect' statements are commented out. TABLE_EXISTS_ACTION - Original imp would allow rows to be appended to pre-existing tables if IGNORE=Y was specified. Impdp's TABLE_EXISTS_ACTION parameter provides four options:
CONTENT - This new parameter, applicable to both clients allows the movement of DATA_ONLY, METADATA_ONLY or BOTH (the default). VERSION - expdp supports the VERSION parameter which tells the server-based Data Pump to generate a dumpfile set compatible with the specified version. This will be the means to perform downgrades in the future. There is no need (as there was with original exp) to run older versioned Data Pump clients. Enterprise Manager supports a fully functional interface to Data Pump. Data Pump Views - Data Pump maintains a number of user and DBA accessible views to monitor the progress of jobs: DBA_DATAPUMP_JOBS: This view shows a summary of all active Data Pump jobs on the system. USER_DATAPUMP_JOBS: This view shows a summary of the current user's active Data Pump jobs. DBA_DATAPUMP_SESSIONS: This shows all sessions currently attached to Data Pump jobs. V$SESSION_LONGOPS: A row is maintained in the view showing progress on each active Data Pump job. The OPNAME column displays the Data Pump job name.
Original exp and imp Both original export and import will ship with Oracle Database 10g:
Differences Between Data Pump and Original exp/imp This section highlights some of the main differences you will notice when first running Data Pump-based export and import: Data Pump is designed for big jobs with lots of data. This has a number of implications:
The user running Data Pump must have sufficient tablespace quota to create the master table. Be cognizant of rollback configuration in jobs containing many objects. The progress messages displayed by the clients are different than those displayed by the original exp/imp clients, but still reflect the current object type. Also note that 'Already exists' errors are flagged as such and included in the total errors count issued at the end of the job. A log file is generated by default with a name 'export.log' or 'import.log'. This can be over-ridden with the LOGFILE parameter. Original exp and imp do not generate a log file by default. Conclusion Data Pump is a new, callable facility in the Oracle 10g database that provides very high speed loading and unloading of data and metadata. New export and import clients, expdp and impdp that fully exploit Data Pump's infrastructure are also provided in this release. They are implemented as complete supersets of original exp / imp functionality and will eventually replace them. |