AK's Programming Blog

Sunday, January 8, 2012

Database Replication: SQL SERVER 2005 to MYSQL 5.x REPLICATION

In order to Set up Replication , It is necessary to go through basic replication concepts, If you know all of them, just skip through this section of the section and see below :

So, now basically Database Replication is:

Database replication is the creation and maintenance of multiple copies of the same database. In most implementations of database replication, one database server maintains the master copy of the database and additional database servers maintain slave copies of the database. Database writes are sent to the master database server and are then replicated by the slave database servers. Database reads are divided among all of the database servers, which results in a large performance advantage due to load sharing. In addition, database replication can also improve availability because the slave database servers can be configured to take over the master role if the master database server becomes unavailable.

REPLICATION can be done in three different ways :

Snapshot replication: Data on one server is simply copied to another server, or to another database on the same server.
Merging replication: Data from two or more databases is combined into a single database.
Transactional replication: Users receive full initial copies of the database and then receive periodic updates as data changes.

REPLICATION COMPONENTS:

Publisher

The Publisher is a database instance that makes data available to other locations through replication. The Publisher can have one or more publications, each defining a logically related set of objects data to replicate.

Distributor

The Distributor is a database instance that acts as a store for replication-specific data associated with one or more Publishers. Each Publisher is associated with a single database, referred to as a distribution database, at the Distributor. The distribution database stores replication status data, metadata about the publication, and in some cases, acts as a queue for data moving from the Publisher to the Subscribers. In many cases, a single database server instance acts as both the Publisher and the Distributor. This is referred to as a local Distributor. When the Publisher and the Distributor are configured on separate database server instances, the Distributor is referred to as a remote Distributor.

Subscribers

A Subscriber is a database instance that receives replicated data. A Subscriber can receive data from multiple Publishers and publications. Depending on the type of replication chosen, the Subscriber can also pass data changes back to the Publisher or republish the data to other Subscribers.

Article

An article identifies a database object that is included in a publication. A publication can contain different types of articles, including tables, views, stored procedures, and other objects. When tables are published as articles, filters can be used to restrict the columns and rows of the data sent to Subscribers.

Publication

A publication is a collection of one or more articles from one database. The publication grouping of multiple articles makes it easier to specify a logically related set of database objects and data that are replicated as a unit.

Subscription

A subscription is a request for a copy of a publication to be delivered to a Subscriber. The subscription defines what publication will be received, where, and when. There are two types of subscriptions: push and pull.

This can be illustrated by an example , A "publisher publishes a book that is recieved by distributors and then it dilevers the book to each of the subscribers".

Now that we are up with loads of concepts about replication we can work out for our real problem:

Database Replication from Sql Server 2005 to Mysql 5.x

Recently, I was setting up replication from SQL server 2005 to Mysql 5.x. I spent a lot of hours to research on setting up the task. Some blogs and articles didnt mention the replication between SQL Server 2005 and Mysql, as mysql do not officially allow the replication between them, they always mention Oracle as the possible non-SQL subscriber. As scene in the following screenshot

So, that is why we need a way that should other than this wizard. i.e we should set up replication with stored procedure provided by SQL Server 2005.

First task is to make a publisher for a database. Which is achieved by the stored procedure sp_addpublication, in our case we executed this stored procedure as :

exec sp_addpublication @publication = N'mydb'

, @description = N'Transactional publication of database'

, @sync_method = N'concurrent_c'

, @retention = 0

, @allow_push = N'true'

, @allow_pull = N'false'

, @allow_anonymous = N'true'

, @enabled_for_internet = N'false'

, @snapshot_in_defaultfolder = N'true'

, @compress_snapshot = N'false'

, @ftp_port = 21

, @allow_subscription_copy = N'false'

, @add_to_active_directory = N'false'

, @repl_freq = N'continuous'

, @status = N'active'

, @independent_agent = N'true'

, @immediate_sync = N'true'

, @allow_sync_tran = N'false'

, @allow_queued_tran = N'false'

, @allow_dts = N'false'

, @replicate_ddl = 0

, @allow_initialize_from_backup = N'false'

, @enabled_for_p2p = N'false'

, @enabled_for_het_sub = N'true'

, @autogen_sync_procs = 'false'

We used "@sync method" to be concurrent_c as because,

[ @sync_method=] 'sync_method'

Is the synchronization mode. sync_method is nvarchar(13), and can be one of the following values.

Value	Description
native	Produces native-mode bulk copy program output of all tables. Not supported for Oracle Publishers.
character	Produces character-mode bulk copy program output of all tables. For an Oracle Publisher, character is valid only for snapshot replication.
concurrent	Produces native-mode bulk copy program output of all tables but does not lock tables during the snapshot. Only supported for transactional publications. Not supported for Oracle Publishers.
concurrent_c	Produces character-mode bulk copy program output of all tables but does not lock tables during the snapshot. Only supported for transactional publications.
database snapshot	Produces native-mode bulk copy program output of all tables from a database snapshot. This option requires SQL Server Enterprise.
database snapshot character	Produces character-mode bulk copy program output of all tables from a database snapshot. This option requires SQL Server Enterprise.
NULL (default)	Defaults to native for Microsoft SQL Server Publishers. For non-SQL Server Publishers, defaults to character when the value of repl_freq is Snapshot and to concurrent_c for all other cases.

[ @repl_freq=] 'repl_freq'

We used replication frequency to be continous as we need to set up transactional replication.

@enabled_for_het_sub = N'true'

We needed to support heterogenous replication.

Now, What is Heterogeneous Replication?

Heterogeneous replication works basically the same way, but means having a Non-SQL Server in the role of publisher or subscriber. As our subscriber will be Mysql 5.x.

Now that we have set up publication, Secondly we need to add articles to the publication,

--add table 1

exec sp_addarticle @publication = N'aclc_pub'

, @article = N'SS_SetupTarget' -- your database table

, @source_owner = N'dbo'

, @source_object = N'SS_SetupTarget'

, @type = N'logbased'
, @pre_creation_cmd = N'delete'

, @ins_cmd = N'SQL'

, @del_cmd = N'SQL'

, @upd_cmd = N'SQL'

, @schema_option = 0x8000000

, @status = 24

--add all of the columns to the article

exec sp_articlecolumn @publication = N'aclc_pub'

, @article = N'SS_SetupTarget'

, @refresh_synctran_procs = 1

--end add table 1

So, now here comes another critical point,

Actually for transactional replication for non SQL subscriber, SQL server generates the bulk inserts for all the rows in the table. And what if the object is already set up there. We use @pre_creation_cmd = N'delete', so now the SQL Server will delete all the rows from the subscriber and then inserts all the rows. If pre creation cmd is none than it will dump all the rows, which can result in duplication of records accros your Mysql Database. This case is limited to mysql or other non-sql server subscriber as on homogeneous subscribers it only publishes the changed or inserted rows.

It can be a good idea if you only add the row inserted all along this period. But I have chosen so that any single row is not missed, as our data need to be really consistent. There is another replication technique where you move the subscriber to sql server subscriber and then move the replicated rows to mysql and delete the entire rows of the middle database tables.

SO now arise another question , whats this “@schema_option = 0x8000000”.

Actually schema option is the bimask for schema generation . You can have a look and set up according to your need by looking at : http://technet.microsoft.com/en-us/library/ms173857.aspx

Now, we should set snapshot agent for specified publication.

exec sp_addpublication_snapshot @publication = N'aclc_pub', @frequency_type = 4, --daily -- @frequency_interval = 1, @frequency_relative_interval = 1,
@frequency_recurrence_factor = 0,
@frequency_subday = 4,
@frequency_subday_interval = 15, @active_start_time_of_day = 0, @active_end_time_of_day = 235959,
@active_start_date = 20111227,
@active_end_date = 99991231,
@job_login = null, @job_password = null,
@publisher_security_mode = 1GO

Add publication creates a snapshot agent. We need to set up the time after which it should re initialize itself by using @frequency variables.

@frequency_type = 4 – which is daily, as we want the data to be synchronized daily , @frequency_relative_interval = 1 - unused,

@frequency_recurrence_factor = 0

@frequency_subday = 4,

@frequency_subday_interval = 15 – these are the minutes after which it generates a snapshot,

For our case we should set frequency_subday to be 8(in hours) and @frequency_subday_interval to 24.

In our case, we should set @active_start_time_of_day half an hour before the

Now that snapshot agent is set up we should move to set up subscriber.

But before that we need to Set up MySQL ODBC. Now download and install mysql ODBC connector from http://dev.mysql.com/downloads/connector/odbc .

Now, you go to ODBC connector > System DSN > Add.. > Mysql odbc 3.5 > OK.

Now that you have setup DSN on your windows machine

exec sp_addsubscription
@publication = N'mySqlReplication',
@subscriber = N'REPLICATION',
@destination_db = N'employee_info',
@subscription_type = N'Push',
@sync_type = N'automatic',
@article = N'all',
@update_mode = N'read only',
@subscriber_type = 1

exec sp_addpushsubscription_agent
@publication = N'mySqlReplication',
@subscriber = N'REPLICATION',
@subscriber_db = N'employee_info',
@job_login = null, @job_password = null,
@subscriber_security_mode = 0,
@subscriber_login = N'root',
@subscriber_password = null,
@subscriber_provider = N'MSDASQL',
@subscriber_datasrc = N'replication',
@frequency_type = 64,

@frequency_interval = 1,
@frequency_relative_interval = 1,
@frequency_recurrence_factor = 0,
@frequency_subday = 4,
@frequency_subday_interval = 5,
@active_start_time_of_day = 0,
@active_end_time_of_day = 235959,
@active_start_date = 0,
@active_end_date = 0,
@dts_package_location = N'Distributor'

You should mention your DSN name in @subscriber, @subscriber_db should be the mysql database you wish to replicate. @subscriber_login and @subscriber_password should be the username and password for you DSN. For sp_addpushsubscription_agent you should set up the frequency after which the subscriber will be refreshed with the snapshot generated by the snapshot agent.

You might need some extra tools to set up, specially you will need schema of the sql server 2005 to be set up for mysql. You should download Mysql Migration toolkit from mysql site.

Sunday, January 1, 2012

Drupal Security

Recently, I was working for governmental site which need to be really secure. I was using Drupal as CMS / framework, so it was needed that i should look into the drupal short comings to security, but i really found that drupal has more than enough of what a framework should offer to address security. As a developer we just need to use its existing features to enhance security.

I really need to go through following Scenarios where security needed to be up to the mark and solutions we made :

1. SQL Injection

What SQL Injection is:

SQL Injection is subset of the an unverified/unsanitized user input vulnerability ("buffer overflows" are a different subset), and the idea is to convince the application to run SQL code that was not intended.

There can be many type of SQL Injections which can be prevented with Drupal by taking following Steps:.

i. Use db_query with its placeholders as drupal passes the inputs from sql injection filters before sending the query to database.

db_query('SELECT foo FROM {table} t WHERE t.name = '. $_GET['user']);

db_query("SELECT foo FROM {table} t WHERE t.name = '%s' ", $_GET['user']);

db_query("SELECT t.s FROM {table} t WHERE t.field IN (%s)", $from_user);

$placeholders = implode(',', array_fill(0, count($from_user), "%d"));

db_query("SELECT t.s FROM {table} t WHERE t.field IN ($placeholders)", $from_user);

Drupal core ‘db_query ’ performs all the necessary checks against sql injections. For integer checks it insures that it is numeric , so the query do not break.

ii. Use check_plain, check_markup, filter_xss on user inputs.

check_plain($string) --> To present all HTML as encoded entities.

check_markup($string) --> To allow at least some HTML. When a user has selected a specific format. When you are unsure of the format, and need HTML, but need to limit the HTML that is allowed, use the‘‘default’’ format as a fallback.

filter_xss($string, $allowed_tags ) --> Removes characters and constructs that can trick browsers. Makes sure all HTML entities are well-formed. Makes sure all HTML tags and attributes are well-formed. Makes sure no HTML tags contain URLs with a disallowed protocol (e.g. javascript:)

You can also use filter_xss_admin where admin is to make an input.

2. Security Testing

There are number of Drupal Specific testing solutions Such as :

Coder Module (http://drupal.org/project/coder):

The Coder module is a powerful tool for analyzing Drupal code. The module was created by Doug Green, but it has since had significant improvements by many users. Initially it analyzed code to ensure it conformed to the Drupal coding standards and to help identify changes from one version of Drupal to another, but since it is built in an extensible manner, it can perform many different kinds of source-code analysis. It has been expanded to include some simple security checks and could be expanded to cover more security tests.

How to use Coder Module:

Code Review, where you can control the default review to be performed. To actually run the review, visit the /coder path on your site, where you will see a screen that allows you to select which tests to run. Click the Submit button and, after a few seconds, Coder presents a report about the tests it ran and any problems it identified.

Security Scanner (http://drupal.org/project/security_scanner )

How to use the Security Scanner tool:

1. Enable the Scanner and XSS components of the module.

2. Visit Administer Site Configuration Security Scanner,

3. Set Mode in the Security Scanner Settings section to Crawl.

4. Execute the Crawl by visiting the cron.php file.

5. Repeat this process of setting a mode and visiting cron.php for the

next two modes: Seed and ReCrawl.

Grendel Scan ( http://www.grendel-scan.com/download.htm )

A part from Drupal specific / based tools, there are also several general tools available to perform vulnerability analysis. Many of these tools tackle individual pieces: SQL injection, XSS, and providing a local proxy that allows a user to manually alter browser requests. There is also a relatively new tool called Grendel-Scan which leverages many existing tools to be able to provide an amazing array of scanning and vulnerability analysis tools.

2. Form Flooding

use flood_is_allowed() in your custom modules which controls flooding of a form by limiting the number of form submission allowed. enable token.

http://drupal.org/project/flood_exemption

This is a module which helped us a lot when we were addressing form flooding. As, it limits the number of submissions from a particular IP.You can also enhance the functionality by limiting it to a particular MAC address.

3. Password Cracking by the of simple brute force algorithm

Enabling this module, a site administrator may limit the number of invalid login attempts before blocking accounts. This is how the brute force algorithm works.

http://www.ethicalhackingguide.net/2011/08/using-fireforce-to-brute-force-web_04.html

http://drupal.org/project/login_security

Enabling this module, a site administrator may limit the number of invalid login attempts before blocking accounts, or denying access by IP address, temporarily or permanently.

4. Spam Blocking

Mollom can be other option but it is paid service.

http://drupal.org/project/spam

This module can certainly block the spam entries on your blog or forum. Wordpress’s Akismet is the most successful component for spam blocking.

5. Directory Browsing

Web server directories should have the configuration lock-down the browsing of directories.

6. Session Hijacking

Use web application server session management APIs when possible. Tools like fire sheep can be used to test Session Hijacking.