Photo Organizer

  • Status Closed
  • Percent Complete 100%
  • Task Type Feature Request
  • Category Backend / Core
  • Assigned To pizza
  • Operating System All
  • Severity Low
  • Priority Defer
  • Reported Version Devel
  • Due in Version Undecided
  • Due Date Undecided
  • Votes 3
  • Private
Attached to Project: Photo Organizer
Opened by pizza - 2006-05-11
Last edited by pizza - 2011-11-23

FS#42 - Add a bulk upload client.

It would be nice, eh?

Closed by pizza
2011-11-23 15:36
Reason for closing: Won't implement
Additional comments about closing: This isn't ever going to happen; it falls well outside my workflow. We can always re-open this later.
Project Manager
Luud commented on 2006-10-02 09:02

Yes, it would be very nice.

As there is already a bulk upload from the filesystem, it shouldn't be too hard to extend it to be invoked from the command line (right?). We don't need to be all that concerned about security, as this would only work if you're logged in locally on the server as the web user (e.g. apache), which you'd need to be to run the PHP CLI tool to import the photos. This would also help because the CLI tool would not be restricted by a maximum run time or memory usage, as is the case (and needed) for PHP scripts run through the webserver.

Admin
pizza commented on 2006-10-02 17:53

I'd rather just do the work once; we really need a remote (HTTP) bulk upload command-line tool, and if anything it's probably simpler, as the HTTP request interface doesn't require knowledge of internal data structures.

It would be great to have a checkbox in the bulk upload UI (maybe only when the files are on the local disk) to process the files in a background process instead of tying up the browser for ages...
If it just starts a predefined command and sends its log to a specific file somewhere in the webserver's filespace, security should not be an issue...

Admin
pizza commented on 2007-08-04 13:12

I also want to be able to process multiple images in parallel, to take advantage of multiple processor cores.

Unfortunately, PHP isn't well suited for non-interactive, non-browser tasks. I haven't dug into it particularly deeply though.

It should be possible to start the PHP script via a shell command, in the background. There are examples in the documentation for the "system" and "exec" functions.
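As a sketch of that idea (in Python rather than PHP, purely for illustration; the worker command here is a made-up stand-in for something like `php photo.worker.php`):

```python
import subprocess
import sys

# Launch a long-running worker as a separate process, so the web request
# can return immediately instead of blocking on the import.
# The command is a stand-in; a real setup would invoke the PHP CLI worker.
proc = subprocess.Popen(
    [sys.executable, "-c", "print('worker: importing photos...')"],
    stdout=subprocess.PIPE,
    text=True,
)

# The web request could return here; we wait only to capture the worker's
# output for this demonstration.
out, _ = proc.communicate()
```

In a real deployment the parent wouldn't wait at all; the worker would write its log to a file instead of a pipe.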

I'm thinking of some kind of background job manager. Each new job (every time-consuming operation, like adding images or regenerating previews) could be put in a table. When you request execution in the background (via a checkbox), the job is added to the table, and if the job manager is not already running, it's launched.

This background process could then do all the jobs one after another, generating a log for each, and terminate itself when there is nothing left to do.

That could be done per user. In that case, all users can start jobs in parallel, but only one job per user runs at a time.

A specific "My jobs" page could be added to the "My tools" pages, with the status of each requested job (Pending, Running, Done, Abnormally terminated), a link to view its log, and a button to delete it from the list...

What do you think?
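A minimal sketch of that job-table scheme, in Python with SQLite purely for illustration (the table layout and status names are assumptions based on the description above):

```python
import sqlite3

# Hypothetical job table: one row per queued background job.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE jobs (
        id      INTEGER PRIMARY KEY,
        user    INTEGER NOT NULL,
        status  TEXT NOT NULL DEFAULT 'Pending',  -- Pending / Running / Done
        details TEXT NOT NULL
    )
""")

def submit(user, details):
    """Queue a job; a real implementation would also launch the manager if idle."""
    conn.execute("INSERT INTO jobs (user, details) VALUES (?, ?)", (user, details))

def run_pending():
    """Process jobs one after another, then return (i.e. terminate itself)."""
    while True:
        row = conn.execute(
            "SELECT id, details FROM jobs WHERE status = 'Pending' LIMIT 1"
        ).fetchone()
        if row is None:
            break  # nothing left to do
        job_id, _details = row
        conn.execute("UPDATE jobs SET status = 'Running' WHERE id = ?", (job_id,))
        # ... do the time-consuming work here (import images, make previews) ...
        conn.execute("UPDATE jobs SET status = 'Done' WHERE id = ?", (job_id,))

submit(1, "add images: /incoming/batch1")
submit(1, "regenerate previews")
run_pending()
```

The per-user variant would simply add a `WHERE user = ?` filter so each user's manager only drains its own queue.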

Admin
pizza commented on 2007-08-14 14:13

There's a "thread" wrapper out there that might work:

http://codingtheweb.users.phpclasses.org/browse.html/package/1136.html

We can kick off one of these for each CPU.

When a user issues an upload, it'll add an entry in a "job table" for each individual image. A "master job controller" thread will take all submitted jobs and issue up to N simultaneously. The status of each will be recorded and the user can check up on the results.

We have to worry about cleaning up properly -- deleting temp files and directories as necessary.

create table jobs (
    identifier integer primary key,
    user integer references users(identifier) not null,
    done boolean not null,
    details text not null,
    status text not null
);

details and status would be serialized PHP objects, as there's potentially lots of stuff that needs to go there.
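A rough sketch of that master controller, in Python for illustration, with JSON standing in for PHP's serialize() and a thread pool capping the number of simultaneous jobs at N (all names here are hypothetical):

```python
import json
from concurrent.futures import ThreadPoolExecutor

N = 4  # at most N images processed simultaneously, e.g. one per CPU core

def process_image(job):
    """Process one job-table entry and report its final status."""
    details = json.loads(job["details"])  # stand-in for unserialize()
    # ... decode / resize / generate previews for details["path"] ...
    return {"id": job["id"], "status": json.dumps({"result": "ok"})}

# One job-table entry per uploaded image, with details serialized as text.
jobs = [
    {"id": i, "details": json.dumps({"path": f"/tmp/upload/img{i}.raw"})}
    for i in range(10)
]

# The "master job controller": issue up to N jobs at once, record each status.
with ThreadPoolExecutor(max_workers=N) as pool:
    results = list(pool.map(process_image, jobs))

statuses = {r["id"]: json.loads(r["status"])["result"] for r in results}
```

Cleanup of temp files would go in a `finally:` inside `process_image`, so a failed job still deletes its scratch directory.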

(This really doesn't bring us closer to a bulk upload client, though....)

Admin
pizza commented on 2008-10-27 15:33

A mechanism for parallelizing imports is now mostly complete. It also prevents an effective DoS caused by many import jobs being submitted at the same time.

A remote client for submitting this stuff would still be nice, though.

Old thread, but while you're at it, maybe add an option to schedule the upload, either by the admin and/or the user (depending on authorizations, of course). Scheduling could mean uploading to the server but not processing it (waiting for the scheduled moment), or not uploading to the server at all and waiting for the scheduled moment on the client to upload.

Example 1: I want to upload 350 RAW images, which is rather time-consuming. I want to schedule it for midnight. The user's workstation needs to stay on for this. On the server, processing of the upload is immediate.

Example 2: the admin has determined that batch uploads always run at midnight, so that at any other time users cannot upload more than x images of y type and/or z size (all configurable, of course).

Example 3: the admin has set server batch upload processing to midnight; a user loads 100 pics at noon. The user can switch off the workstation, and the server will pick up the pics at midnight.

Let me know when you would like more explanation, or testing.

Admin
pizza commented on 2011-01-14 18:44

I basically passed the buck on this one, but the ticket's left open as a "that would be nice..." idea, because writing a decent upload client would be a non-trivial task, and I personally have zero use for it. (I just copy the files I care about to the server via rsync or something like that)

Meanwhile, PO already supports background image import/processing. It would be fairly trivial to add a cron job to kick off the processing elements at some arbitrary time, but setting that up is more of a local system administration thing. I personally leave the background workers idling waiting for work.

(I routinely import hundreds of RAW images at a time; I copy the files to a directory on the server, kick off the import, and go to bed -- decoupling the processing from the web server process means I can close the client, and easily utilize multiple CPUs as well. )

"
(I routinely import hundreds of RAW images at a time; I copy the files to a directory on the server, kick off the import, and go to bed -- decoupling the processing from the web server process means I can close the client, and easily utilize multiple CPUs as well. )
"

Good point there!

I'd be interested in the steps you perform to do just that, loading during the night (I mean: do you have a script, and can you share it?).

The only thing is that copying the files to the server is usually admin-only. Or would this also work for a user directory on the server? It would still mean the user has to have a directory on that server large enough to hold all the files to be uploaded. For me personally, that is fine, but when you're working with multiple users, it might not be.

Still appreciate it if you could share your steps.

Admin
pizza commented on 2011-01-17 14:49

Everything I use is already part of PO (2.37 or later), but here's the rundown of what I do:

in your config_site.php, set:

$local_bulk_upload = true;

Then via the admin interface, you can specify a bulk upload directory for each individual user to use. This directory should be writable by both the webserver's account and the local user's account.

I just use rsync or scp or netcat to dump the files over to that directory, and on the photo add page, I select the 'local upload' option that's now there.

Meanwhile, to enable the background import stuff, in config_site.php:

$external_workers = TRUE;
$num_of_workers = 4; // I set this to one per core on my server

After this, you'll need to run 'photo.worker.php' from the cmdline -- but as the webserver. You can do this via a terminal, init script, or whatever. Personally I run mine in a screen session as follows:

sudo -u apache php photo.worker.php

But you could just do something like:

su - apache -c 'nohup php photo.worker.php &'

...and just let it run forever. Alternatively you could schedule a cron job to kick it off at a certain time. Right now the worker just runs forever, but it could be tweaked to automatically terminate itself when there's no more work to be done.
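If you'd rather have cron kick it off at a fixed time, a crontab entry for the webserver account might look like this (paths are assumptions; and note the worker would need the terminate-when-idle tweak mentioned above to avoid piling up copies):

```shell
# Hypothetical crontab line (install with: crontab -u apache -e)
# Start the worker at midnight and append its output to a log.
0 0 * * * php /var/www/po/photo.worker.php >> /var/log/po-worker.log 2>&1
```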

The basic problem with all of the stuff I just described is that it requires integration with the rest of the system -- and that requires active cooperation on the part of the sysadmin, so it's not just something I can supply in a ready-made script with the PO download.

Thanks a lot Solomon, I really appreciate it!

It's a moot point, since you've already stated you're not going to pursue it, but needing the sysadmin is what makes this less suitable for multiple users (as in a medium-size company). If you could do the setup via the front end -- the admin tab and the upload settings -- with normal user permissions, it would really add usability.

Anyway, I'll give your method a try; it looks promising.

Thanks again.

Admin
pizza commented on 2011-01-24 15:19

In any case we'll still need some sysadmin support to set up the background worker(s), but that's a one-off task. But without a purely php/web-based bulk upload tool, I don't see any way to cut the sysadmin out of the loop, because a multi-user system requires the users to be isolated from each other. (I've toyed with the idea of supporting webDAV and an FTP interface, both implemented in pure PHP and supporting both bulk upload and bulk downloads)

In practice, the actual sysadmin involvement is pretty slight, and the appropriate directory/permission setup can easily be automated -- but the details depend entirely on the configuration of the system, and that's completely outside what PO can provide out of the box. The culprit here is actually the user isolation that a proper multi-user system provides.
