Monday, March 26, 2012

Help! store document in SQL server

I am using Access2002 and SQL server 2k. I need to store some documents (PDF) in the database as embedded object that can be activated and edited by double-click. How can I implement this in VBA/T-SQL in the same way that "insert embedded object" does?

UPDATETEXT or WRITETEXT will just store the document as a "long binary data". I've heard it's better only store the path in database but I need to open the database for the remote access via Internet. Can I achieve this by just storing the file path?

Any help appreciated!You heard correctly about using the file system to store the documents, and only storing the path in the database. The file system will give you much better performance.

You will want to store the documents in a folder on your web-accessible server (web server?), in a folder with Read security permissions granted for your users (either end-user account for secure server, or web server account for public server). In the DB, the path should point to the URL of the file, such as "http://myserver.mydomain.com/files/myfile.pdf" and not the physical location ("c:/files/myfile.pdf").|||Or you could store the physical path to the PDF in the server, then when you need to send the file to them, you stream the data to them (which is different then pointing them to a URL).|||Thanks bpdWork and Seppuku! I now have a clue how to do it.

Seppuku, do you mean use somthing like ADO stream to send files and then SaveToFile at remote computers?|||Originally posted by whfrank
Thanks bpdWork and Seppuku! I now have a clue how to do it.

Seppuku, do you mean use somthing like ADO stream to send files and then SaveToFile at remote computers?

Exactly... If you stream the file as an attachment it will force the user to download, otherwise some browsers will attempt to associate it with an application (such as Word, Excel, PDF, etc) and open the document in the browser. You can do it either way. I've done both.|||Exactly. If the user opening the document in the browser is acceptable, then a file reference will be a lot less work than the streaming. If not, then streaming os the way to go.|||Originally posted by bpdWork
Exactly. If the user opening the document in the browser is acceptable, then a file reference will be a lot less work than the streaming. If not, then streaming os the way to go.

Unless you want to secure the location of the document. There may be a time when you want to download a doc into the browser window, but it's location is secure (due to username/password or some other criteria). To secure it, you may still want to stream the document to the browser so you can validate the user against any security requirements you may have. When streaming a file, you can force the download by setting the Content Disposition header as an attachement. If you ignore that, it will download into the browser (assuming there's an association), but still gives you the ability to validate the user.

An example would be a document management system. If you just provide URLs to the documents, anyone can get them without authenticating. If you stream the document to them, you can ensure they are logged in and have the proper rights to the document, and then can send that document to them to load in their browser (or download, by forcing the Content Disposition to attachment).|||Actually, you can secure the documents in the file system using NTsecurity as well. Unauthorized uses will not be able to download documents they do not have access to.|||Yes but that requires you to have administrative rights on the server to setup and manage. Besides, the users you want to access the files don't always have accounts on your NT domain, requiring you to set up an account for everyone. If this is on an Intranet, NT auth may work fine, but on the Internet, chances are they won't be on your domain.

If you really want to go that route, a LDAP solution works well. I'm doing that right now on a site using Siteminder and Secureway. It prompts for user access similar to NT auth, but can be updated through a web interface.|||Originally posted by Seppuku
Exactly... If you stream the file as an attachment it will force the user to download, otherwise some browsers will attempt to associate it with an application (such as Word, Excel, PDF, etc) and open the document in the browser. You can do it either way. I've done both.

If I use Access to develop the front-end application, the physical path stored in SQL server is like "C:/myfile.pdf", but this is the location on the server. How to write a procedure in Access on remote computers stream it and send it? I'm a bit confused, please help me out. Thanks!|||Originally posted by whfrank
If I use Access to develop the front-end application, the physical path stored in SQL server is like "C:/myfile.pdf", but this is the location on the server. How to write a procedure in Access on remote computers stream it and send it? I'm a bit confused, please help me out. Thanks!

Are you using Access as your database on the Internet? Or just to test with? Access should never be used as an Internet database backend.

As for how to stream it, what server-side scripting language are you using on your website?|||you guys are all saying that it's slow... Actually, unless you use "text in row" option on the table, - only pointers to text/image pages are stored in the table. Also, if you implement methodology that requires to store the actual documents in the file system, you just expanded your responsibility from being a dba to also being in charge of the directory structure, os access right, etc., etc... And that's not to mention disaster recovery and business continuity... Just a thought :)|||Actually, even without text-in-row turned on, the B-Tree structure used by SQL is much slower than the file system. SQL stores a pointer to the B-tree structure in the table. Depending on the file size, the b-tree breaks the file up into little pieces, with a string of pointers connecting them. Upon retrieval, the pieces are reassembled into a complete file. MS actually recommends using the file system for this type of storage if it is an option.

I belive from his last post the whfrank is using Access as the front end, and connecting to a SQL backend over the internet. I haven't played with Access front-ends in a while (thank god!!), so I have no idea how it would deal with BLOBs, or streaming for that matter, but I am sure by now they have been built in. Maybe MS has something about how to use BLOBs through Access, which I am sure will be the limiting factor.|||From past experience, the best performance I've seen from storing BLOBs in the DB is when they are small (such as images).

I agree that you start getting into a gray area when you start storing on the disk w/ pointers in the DB, but that's why you have system admins and security for in the first place. If you're doing this right, then you have disaster recovery on the disks and in SQL Server. You should have a team of system admins and security personnel. All of that is their responsibility to maintain.

Just as bpd said, everything has to be pulled from the disk whether it's stored in the DB (which is just a pointer to an area on the disk), or a drive path.

With the BLOB stored in the DB, you have to make a request for it, the DB executes the request, hits the disk for the files containing the data, then send all the data (through the DB) to the requesting app to be streamed to the user. This takes up resources that the DB wasn't meant to do (file transfer).

When you store the file path in the DB, you make a request for the for it, the DB executes the request and returns a string, then the file system can be directly accessed for the contents of the file and streamed to the user. Less IO, more efficient, and you remove the DB from the mix, leaving it to handle other requests quickly.

There may be times though when this is not functionally possible. An example would be if your SQL Servers were separate from your Web Servers. In this case, all of your web servers might not have all of the documents locally (unless you're using a content manager). You may need to send the document from the SQL server to the web server in that case. But that's even more time consuming because it involves the network in a DMZ.

What just came to mind though is that if you can pipe the documents from another server, you're set. I've never tried opening a file through named pipes in ASP though. Sounds like I have a project :)

I could be wrong in all that.. but it just seems logical that you separate what the DB is good at doing from what the FS is good at doing..|||Well, that's what this forum is for, - to share, and disagree at times (which is what I am doing now :) )

I have an app that has a custom report writer built into it (it's not Crystal, nowhere near, but better :) ). Certain users are allowed to create new reports and modify existing ones. Others are allowed to only execute existing reports. The app has its own security that is based on database roles. The fact that I am storing report definitions into text fields allows me not to worry about security. I know-I know, yoall gonna tell me that even if I store definitions into FS the security policy can be made to support the app security. And I would agree with you, maybe :)|||Originally posted by bpdWork
I belive from his last post the whfrank is using Access as the front end, and connecting to a SQL backend over the internet. I haven't played with Access front-ends in a while (thank god!!), so I have no idea how it would deal with BLOBs, or streaming for that matter, but I am sure by now they have been built in. Maybe MS has something about how to use BLOBs through Access, which I am sure will be the limiting factor.

Yes, actually I'm not building a website but a Access-front-end, SQL-back-end application. Our company wants to share some info (it's changing all the time on both sides) with several factories in another country. It's quite simple to use Access Date Project to connect to the SQL Server, you get all tables and queries (same as they are in the SQL server, all LOB types there, text, ntext, image). But all forms, reports, modules (can vary from every user) are bulit in Access and on the remote computers. Then the problem comes out: I get a file path (c:/file.pdf) from DB that is actually the physical location on the server, but for the remote users it's a file path on their own computers... Any idea how to deal with it?|||A few questions need to be answered first.

Are the different locations on the same NT domain, intranet, or share a VPN connection? Or are they communicating over the open internet? This will determine what types of methods are available to you for referencing the file locations.

Perhaps the simplest (not best, but simplest) way of accessing the files, assuming you are going over the open internet, would be FTP or HTTP-based file access. You could store an addess in the db such as http://thefileserver.mycompany.com/file.pdf and shell to it (check out the VBA Shell command, though it may have been replaced by now). That should spawn the user's browser, and, if they have Adobe PDF installed, allow them to view the PDF in the browser, or prompt them to save it if they don't. You would need to set up a simple web server (IIS) to serve up the files (simple).

If the locations are on the same domain, you can protect the file folders with NT Authentication, granting read permissions to a new user group, then adding the people who should have access to that group. If on seperate domains, you can use a local security model, and let everyone use the same account (again, not the trickiest or most secure method, but you could have it running in a day).

If this sounds like what you are after, I'm sure we can expond on it. If not, I'm sure we can help you with another way.|||They are communicating over internet. We have a domain within the company and remote computers can be just thought as individual PCs.

If pointing users to an URL, it's ok for the remote users but seems not so good for the internal users in the domain. One method is to store two paths in the database, one URL and one physical location. Do you think it's feasible?|||Or if you want to get really ugly, you could do web a web server, create a simple ASP page, and stream it from the URL you store in the DB (I know.. I know.. I really don't have a vested interest in streaming, I promise). All I'm saying is that by doing it that way, you don't expose the actual documents.

Your DB would store a list of URLs like http://my.domain.com/getfile.asp?file=myfile.pdf. Then when you shell to the URL as bpd suggested, they will be passed to the site, hit the ASP which could then validate any security requirements you may have, log who viewed it, maybe increment a counter on how many times it's been accessed, etc, then stream the file to them...

If all you want to do is get them the file, no bells and whistles, then this may be to much... go with the simple web server method or the direct file share... but if you have some specialty requirements around this project that goes beyond the scope of getting them the file and providing security, you may need to look to other methods of document retrieval.

Just giving options :)|||Originally posted by whfrank
They are communicating over internet. We have a domain within the company and remote computers can be just thought as individual PCs.

If pointing users to an URL, it's ok for the remote users but seems not so good for the internal users in the domain. One method is to store two paths in the database, one URL and one physical location. Do you think it's feasible?

Why would it not be good to use a URL for both of your users? Do your internal users not have Internet access? If you're communicating over the Internet without a VPN, you'll need to do a web server solution (possibly with SSL if the documents are sensitive). Whether it's pointing to the absolute URL of the document, or to an ASP that streams the document, you'll still need a web server. You do not want to put a computer out on the Internet that has an open share to the file system.

Avoid storing a local path and a URL in the DB if you can. It would be duplication of content.sql

No comments:

Post a Comment