Wednesday, March 28, 2012

Help! Transactional replication

We have a 100+ GB database as our production DB. We need to replicate to a
subscriber across a WAN with absolute minimal latency.
Originally this was configured as a push subscription with the publisher
acting as the distributor. The Distribution Agent ran on the publisher as
well.
We have been having problems all over the place, and I want to recommend to
my boss that we start anew with a different configuration. After reading some
parts of Hilary's book and many posts, this is what I "think" would be
best. Please let me know if this sounds right. Our biggest problems are a
slow production server, many locks, and the Distribution Cleanup agent not
being able to keep up when run nightly, but locking with the other agents
when run continuously.
Proposed plan:
1. Drop the current subscription (which has to be dropped anyway because it
needs to be manually reinitialized).
2. Remove the production DB as a publisher and drop the distribution database.
3. Use SQL LiteSpeed to create a compressed backup split into 6 files and
push the files over the WAN (also have an IT person send the files by
express mail as a backup).
4. Restore the backup to the DB that will be the subscriber.
5. Continue to let the production DB make hourly transaction log backups,
and pull these over the wire as well, to make up for the lag time of pulling
the main backup over the wire.
6. Restore the transaction log backups.
7. Set up the production DB as a publisher AND, if approved, use a remote
distributor.
8. Set up the subscriber with a pull subscription. (If we can get a remote
distributor, do I need the Distribution Agent to run on the subscriber? What
about if we can't get a remote distributor?)
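A rough T-SQL sketch of steps 1-2 and 4-6 above (all publication, database,
and file names here are placeholders, not from the thread; the restore is
shown as a native striped restore for simplicity, whereas a LiteSpeed backup
would use its own restore procedures):

```sql
-- Step 1 (on the publisher): drop the existing subscription and publication
EXEC sp_dropsubscription @publication = N'MyPub', @article = N'all',
     @subscriber = N'all';
EXEC sp_droppublication @publication = N'MyPub';

-- Step 2: disable publishing and drop the distribution database
EXEC sp_dropdistributor @no_checks = 0;

-- Steps 4-6 (on the subscriber): restore the striped full backup
-- WITH NORECOVERY so the hourly log backups can still be applied
RESTORE DATABASE ProdDB
    FROM DISK = N'D:\repl\ProdDB_1.bak', DISK = N'D:\repl\ProdDB_2.bak'
    -- ... remaining files of the 6-file stripe set ...
    WITH NORECOVERY, REPLACE;
RESTORE LOG ProdDB FROM DISK = N'D:\repl\ProdDB_01.trn' WITH NORECOVERY;
-- apply the remaining hourly log backups in order; recover with the last one
RESTORE LOG ProdDB FROM DISK = N'D:\repl\ProdDB_36.trn' WITH RECOVERY;
```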
I need the data on the subscriber to be as up to date as physically
possible. A few seconds is the goal.
Please let me know if you need any more info and whether you think this
would be a better/optimal setup.
Thanks,
Kristy
Also, what type of machine would be needed for a remote distributor?
Questions from management:
What kind of machine does it need to be? Can it be a single processor
machine and how much memory and hard disk space would be required?
Could we run it on a subscriber machine?
--Kristy
|||Be careful about placement of your remote distributor. You want it as
physically close to your publisher as possible. You want it clustered for
maximum uptime.
If you put your distributor on your subscriber across your WAN, you will
have Log Reader Agent errors galore if you have WAN congestion or if your
link goes down. When this happens your tlog will balloon, and then you will
have problems getting your Log Reader Agent started again.
Then what you really need to do is find a period of quiescence, where there
is no activity on your publisher, then do your backup and create a no-sync
subscription, but stop your Distribution Agent. BTW, I would probably use a
pull subscription to offload the Distribution Agent onto your subscriber.
Then I'd set the PollingInterval on both the Log Reader and Distribution
Agents to 1. Replicate the execution of stored procedures wherever possible,
use the independent agent option, and group your articles into separate
publications according to DRI.
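The no-sync pull setup described above might be sketched like this (server
and publication names are placeholders; `replication support only` assumes
the subscriber database was already initialized from the backup):

```sql
-- On the publisher: register a pull subscription that skips the snapshot,
-- because the subscriber was initialized from a restored backup
EXEC sp_addsubscription
    @publication       = N'MyPub',
    @subscriber        = N'SubServer',
    @destination_db    = N'ProdDB',
    @subscription_type = N'pull',
    @sync_type         = N'replication support only';

-- On the subscriber: create the pull subscription and its agent job
EXEC sp_addpullsubscription
    @publisher = N'PubServer', @publisher_db = N'ProdDB',
    @publication = N'MyPub';
EXEC sp_addpullsubscription_agent
    @publisher = N'PubServer', @publisher_db = N'ProdDB',
    @publication = N'MyPub', @distributor = N'DistServer';
```

The 1-second polling is not a parameter of these procedures; it is added to
the Log Reader and Distribution Agent job command lines as
`-PollingInterval 1`.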
However, for this type of scenario you might want to look at hardware data
mirroring. This offloads the processing to hardware, and latencies can be
much smaller. The problem is that these solutions can be very expensive, and
your entire array is replicated, so you can't do any business logic or
filtering.
If the cost of latency is significant to you, you have to use a solution
like this.
Hilary Cotter
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html
Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com
|||It should be a dual processor at least. I can't give you answers on memory
or hard disk space, basically because you haven't given me enough
information, and even if you had, I doubt I could give you a useful spec. I
would pick a top-of-the-line machine: 4 GB of RAM, Xeon processors. No idea
on the hard drive size, as I have no idea how much data you are pushing daily.
Hilary Cotter
|||Thanks Hilary.
The remote distributor that is available has multiple disks (C: 80 GB,
E: 250 GB, and F: 80 GB) and a single processor. It also has 1.5 GB of RAM.
This sounds like it is not enough RAM; I'll see if they will add some. We
won't be able to get a dual processor, though.
Are you saying that the distributor needs to be clustered? That means
another machine. Not sure if that one will fly. Maybe in the future, but I
am pushing it to get a remote server in the first place.
Between the time that I do the backup and am able to restore locally, does
the Distribution Agent need to be stopped? Will the publisher still be able
to replicate all of the commands that have happened in the interim? There is
about a 36-hour window before I can get the data to our local server, and I
cannot have the production server down that long. I'd rather have a buildup
of commands that will take a few days to catch up.
You wrote:
> use the independent agent option, and group your articles into separate
> publications according to DRI.
Our app doesn't use stored procs (heavy sigh) because they want the app to
be DB independent. (I'm trying to convince them otherwise.) Should there be
multiple publications from the DB, with articles grouped by DRI, going to
multiple subscriptions in the same subscriber DB? I'm really confused on
that one.
Thanks so much for your help!
--Kristy
|||I was just throwing the RAM figure out there. You might be able to get by
with 1.5 GB; basically you have to monitor paging to see if you have memory
pressure/problems. It's really a function of throughput.
Your problem is going to be what happens when your remote distributor goes
down. This will cause the tlogs on your publisher to balloon. Ideally you
will cluster this machine; if you don't, you will have a single point of
failure, which will make your system vulnerable. I would probably look at
having a local distributor in this case, although depending on throughput
that could cause locking problems. You don't need a cluster, but a cluster
will help.
The Distribution Agent will have to be stopped until you get your subscriber
ready.
Regarding your question on multiple publications and multiple subscriptions:
the answer is yes, for each publication you will need a subscription. If you
have 4 publications, you will need 4 subscriptions to the same subscription
database on the same subscriber.
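As a hypothetical illustration of that one-subscription-per-publication rule
(the four publication and server names below are made up, not from the
thread):

```sql
-- Four DRI-grouped publications, all delivered to the same subscriber
-- database; each publication needs its own subscription
EXEC sp_addsubscription @publication = N'Pub_Orders',
     @subscriber = N'SubServer', @destination_db = N'SubDB',
     @subscription_type = N'pull';
EXEC sp_addsubscription @publication = N'Pub_Customers',
     @subscriber = N'SubServer', @destination_db = N'SubDB',
     @subscription_type = N'pull';
EXEC sp_addsubscription @publication = N'Pub_Inventory',
     @subscriber = N'SubServer', @destination_db = N'SubDB',
     @subscription_type = N'pull';
EXEC sp_addsubscription @publication = N'Pub_Lookups',
     @subscriber = N'SubServer', @destination_db = N'SubDB',
     @subscription_type = N'pull';
```

With the independent agent option, each of these publications also gets its
own Distribution Agent, so they apply changes in parallel.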
It sounds like your manager is being unrealistic: he/she wants low latency
but is not willing to pay for it. Either the cost of latency is very high
for you and you should purchase serious hardware, or the cost of latency is
low and you will have to live with low-cost solutions.
Hilary Cotter
|||Hilary:
A follow-up question...
As I understood it, if the connection from the distributor to the subscriber
goes down, it wouldn't be the transaction log of the publishing database
that balloons but the distribution database itself, since that is where all
the commands that need to be replicated are kept.
You can still perform normal log backups while that link is down, can't you?
Please correct me if my understanding is incorrect.
Thank you
|||Thanks so much, Hilary!
We've had the transaction logs balloon before because of similar issues, and
we have the space and some processes in place to accommodate that if it were
to happen. We have had the distributor local up to now, and it has just
caused too many locking problems.
With the Distribution Agent stopped, will the transactions that need to be
pushed to the subscriber still be recorded? Is that the Log Reader Agent's
job?
I'm still confused on the multiple publication/multiple subscription issue.
I have one 110 GB database that is our production database. The DB needs to
be replicated to one subscriber database. Should I make this into multiple
publications and multiple subscriptions (same DB, different articles) for
the one-to-one databases?
Also, is a single-processor machine going to be enough for the distributor
server? Currently approximately 500,000 commands a day are logged (in the
MSrepl_commands table).
--k
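One way to gauge that daily command volume is to query the distribution
database directly (a sketch; `entry_time` is the time the Log Reader wrote
the transaction to the distributor):

```sql
USE distribution;

-- Total command backlog currently held on the distributor
SELECT COUNT(*) AS total_commands FROM dbo.MSrepl_commands;

-- Commands per day, joining each command to its transaction's entry time
SELECT CONVERT(varchar(10), t.entry_time, 120) AS [day],
       COUNT(*) AS commands
FROM dbo.MSrepl_commands AS c
JOIN dbo.MSrepl_transactions AS t
    ON t.xact_seqno = c.xact_seqno
GROUP BY CONVERT(varchar(10), t.entry_time, 120)
ORDER BY [day];
```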
|||You are completely correct, for a local distributor! However, if you have
a remote distributor, the Log Reader Agent will not be marking the
transactions as replicated in the log, and hence your transaction log will
start to balloon.
You can perform normal log dumps, but then your log will continue to grow.
When you do a tlog dump, your log is not truncated or shrunk in any way;
rather, the status column on the VLFs (virtual log files) is set to 0, which
means that particular VLF can be used again. While replicated transactions
are still waiting to be read, those VLFs stay active and cannot be reused.
When your distributor comes back up, your Log Reader Agent will be able to
read the transactions in the log and write them to the distribution
database; then, when you dump your tlog, the status column of all the VLFs
will be set to 0, and the VLFs can be reused.
It's a subtlety, but one which can bite you if you are not careful.
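Two quick ways to watch for this on the publisher (a sketch; `ProdDB` is a
placeholder, and `log_reuse_wait_desc` requires SQL Server 2005 or later):

```sql
-- Shows REPLICATION while unread replicated transactions pin the log
SELECT name, log_reuse_wait_desc
FROM sys.databases
WHERE name = N'ProdDB';

-- Per-VLF view: Status = 2 means the VLF is still active (not reusable),
-- Status = 0 means it can be overwritten by the next log wrap
DBCC LOGINFO (N'ProdDB');
```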
Hilary Cotter
