Tuesday, March 15, 2022

Azure Large Block Blob Upload with SAS and REST

Attempting to find an example of uploading a large block blob in Azure (breaking it into blocks and finalizing with the Put Block List API) using the SAS token approach against the REST endpoint turned up only a JavaScript example. I worked through that code and translated it into C#. The important parts are that you append comp=block and a Base64-encoded blockid to the end of the SAS URL for each block, and that you finalize the upload by calling the comp=blocklist endpoint with an XML body listing all the blocks.

async Task Main()
{
	// Note: References Nuget Package RestSharp
	var azureSasTokenGetUrl = "";
	var maxBlockSize = 1024 * 1024 * 10; // 10MB block size	
	var fileToUpload = @"";
	var info = new FileInfo(fileToUpload);

	// custom URL for getting sas token accepted a file name value
	var client = new RestClient($"{azureSasTokenGetUrl}?name={info.Name}");
	var response = await client.GetAsync(new RestRequest());
	if (response.IsSuccessful)
	{
		// clean url of any quotes
		var sasUrl = response.Content.Replace("\"", string.Empty);
		Console.WriteLine($"SAS URL: {sasUrl}");

		var blobList = new List<string>();
		var uploadSuccess = false;
		var fileLength = info.Length;
		var numberBlocks = (int)Math.Ceiling((double)fileLength / maxBlockSize);
		client = new RestClient(sasUrl);
		using (var stream = File.OpenRead(fileToUpload))
		{
			for (var i = 0; i < numberBlocks; i++)
			{
				var readSize = (int)(i == (numberBlocks - 1) ? fileLength - ((long)i * maxBlockSize) : maxBlockSize);
				var buffer = new byte[readSize];
				var blockId = Convert.ToBase64String(ASCIIEncoding.ASCII.GetBytes(i.ToString().PadLeft(6, '0')));
				var blockUrl = $"{sasUrl}&comp=block&blockid={blockId}";
				Console.WriteLine($"{i}: {i * maxBlockSize}, length {readSize}");
				blobList.Add(blockId);

				// ReadAsync returns the number of bytes actually read; for a local FileStream this is expected to fill the buffer
				var bytesRead = await stream.ReadAsync(buffer, 0, readSize);
				var request = new RestRequest(blockUrl, Method.Put);
				request.AddBody(buffer);
				request.AddHeader("x-ms-blob-type", "BlockBlob");
				response = await client.PutAsync(request);
				uploadSuccess = response.IsSuccessful;
				if (!response.IsSuccessful)
					break;
			}
		}

		if (uploadSuccess)
		{
			var finalize = new RestRequest($"{sasUrl}&comp=blocklist");
			finalize.AddStringBody($"<BlockList>{string.Concat(blobList.Select(b => "<Latest>" + b + "</Latest>"))}</BlockList>", DataFormat.Xml);
			response = await client.PutAsync(finalize);
			Console.WriteLine($"Success: {response.IsSuccessful}");
		}
		else
			Console.WriteLine("FAILED");
	}
}
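
For reference, here is a minimal sketch of the same two REST calls using only HttpClient instead of RestSharp. It assumes the SAS URL and the block chunks are already in hand (as in the code above) and is meant to illustrate the request shapes rather than to be a drop-in replacement.

// Sketch: upload blocks and commit the block list with plain HttpClient.
// Requires System, System.Collections.Generic, System.Linq, System.Net.Http,
// System.Text, and System.Threading.Tasks.
static async Task UploadBlocksAsync(string sasUrl, IReadOnlyList<byte[]> blocks)
{
	using var http = new HttpClient();
	var blockIds = new List<string>();

	for (var i = 0; i < blocks.Count; i++)
	{
		// Put Block: PUT {sasUrl}&comp=block&blockid={base64 id}, body = raw block bytes
		var blockId = Convert.ToBase64String(Encoding.ASCII.GetBytes(i.ToString().PadLeft(6, '0')));
		blockIds.Add(blockId);

		using var putBlock = new HttpRequestMessage(HttpMethod.Put, $"{sasUrl}&comp=block&blockid={blockId}")
		{
			Content = new ByteArrayContent(blocks[i])
		};
		(await http.SendAsync(putBlock)).EnsureSuccessStatusCode();
	}

	// Put Block List: PUT {sasUrl}&comp=blocklist, body = XML listing every block id in order
	var xml = "<BlockList>" + string.Concat(blockIds.Select(id => $"<Latest>{id}</Latest>")) + "</BlockList>";
	using var putList = new HttpRequestMessage(HttpMethod.Put, $"{sasUrl}&comp=blocklist")
	{
		Content = new StringContent(xml, Encoding.UTF8, "application/xml")
	};
	(await http.SendAsync(putList)).EnsureSuccessStatusCode();
}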


Thursday, January 6, 2022

Authoring resource with LUIS

Azure LUIS is a great cognitive service that requires two resources to be provisioned in order to work: an authoring resource and a prediction resource. The prediction resource can be created in almost any Azure region, but the authoring resource is limited to specific regions. The authoring resource is where the luis.ai site saves and trains your configuration; you then publish that configuration, and the prediction engine uses it as you submit your requests. You can think of these two as training and competition: authoring is where you lift weights, work on your cardio, and practice your forms; prediction is the applied result of your training.
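
To make the split concrete, here is a minimal sketch of calling a published app through the prediction resource. The endpoint, app ID, and key are placeholders, and the URL shape follows the LUIS v3 prediction REST API.

// Sketch: query a published LUIS app via the prediction resource (v3 REST endpoint).
// The endpoint, app ID, and subscription key below are placeholders.
using System;
using System.Net.Http;
using System.Threading.Tasks;

class LuisPredictionSample
{
	static async Task Main()
	{
		var endpoint = "https://<your-prediction-resource>.cognitiveservices.azure.com";
		var appId = "<your-app-id>";
		var key = "<your-prediction-key>";
		var query = Uri.EscapeDataString("book a flight to Seattle");

		// Query the production slot of the published app
		var url = $"{endpoint}/luis/prediction/v3.0/apps/{appId}/slots/production/predict" +
		          $"?subscription-key={key}&query={query}";

		using var http = new HttpClient();
		var json = await http.GetStringAsync(url);
		Console.WriteLine(json); // prediction result: top intent and entities as JSON
	}
}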

Azure functions and SCM_DO_BUILD_DURING_DEPLOYMENT

Deploying Azure Functions via Azure Pipelines, I discovered that if your code is already compiled, you should tell Azure not to build it on deploy. This is done with the app setting SCM_DO_BUILD_DURING_DEPLOYMENT set to false. The standard practice is to build your artifacts once and then deploy that same build artifact to each subsequent environment. If your code is already built and you don't have the above setting, you'll get odd errors such as: "Error: Couldn't detect a version for the platform 'dotnet' in the repo." That error indicates the deployment is trying to build your code; use the app setting to prevent the build.

Monday, December 6, 2021

Setting up a new work laptop

I started at a new company today and am setting up my new laptop. I thought I would spend a few minutes writing my standard list of software that I install by default, for my future self (and perhaps, others).

Must-have software:

For Microsoft Development:
For Git integration:
I also run, as Visual Studio add-ins:
For Azure development:

Wednesday, August 22, 2018

Cosmos Change Feed and Lease Collection Use

The change feed functionality in Azure Cosmos DB is a great feature with a wide variety of uses. The change feed subscribes to changes that occur in a collection and keeps its state in another collection, called the Lease Collection. Because the minimum throughput for a collection is 400 RUs and a single lease barely uses 50 of them, we had decided to store additional metadata documents in the same lease collection; we could not use the Continuation Token and Sequence Number data to determine where in the stream each subscriber was, so we tracked that ourselves for monitoring purposes. Don't do this! The change feed library regularly scans and pulls all documents in this collection and uses them to balance the subscribers, so the more documents the collection holds, the greater the RU requirements. Our RUs were bumping against 4,000 when we finally realized what it was doing:

[Chart: lease collection RU consumption climbing toward 4,000 RUs]

After removing our additional metadata documents, the lease collection RUs are much better (note that we have several subscribers using the same lease collection).

[Chart: lease collection RU consumption after removing the metadata documents]

Now we store our metadata documents in another collection, which happily sits at around 100 RUs, while the lease collection, as you can see above, hovers around the same.

Bottom line: don't store anything custom in the Lease Collection of the change feed, because it dramatically increases the RU requirements.
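
If you are wiring this up with the current .NET SDK (Microsoft.Azure.Cosmos v3), here is a minimal sketch of pointing the change feed processor at a lease container that holds nothing but leases; the connection string, database, container, and processor names are placeholders.

// Sketch: change feed processor with a dedicated lease container (Microsoft.Azure.Cosmos v3).
// Connection string, database/container names, and the processor name are placeholders.
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

var client = new CosmosClient("<connection-string>");
var database = client.GetDatabase("mydb");
var monitored = database.GetContainer("orders");   // the collection being watched
var leases = database.GetContainer("leases");      // leases only - no custom metadata documents

var processor = monitored
	.GetChangeFeedProcessorBuilder<dynamic>("orderProcessor",
		(IReadOnlyCollection<dynamic> changes, CancellationToken token) =>
		{
			// handle the batch of changed documents
			foreach (var doc in changes)
				Console.WriteLine(doc);
			return Task.CompletedTask;
		})
	.WithInstanceName(Environment.MachineName)
	.WithLeaseContainer(leases)
	.Build();

await processor.StartAsync();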

Wednesday, July 12, 2017

TFS Hosted Build Controller and .NET 4.7

If you want the TFS hosted build controller to run .NET 4.7 builds, make sure to change the default agent queue to the "Hosted VS2017" version. You can do this by editing the build definition; on the Tasks Process landing page, the agent queue is the second option.


Sunday, June 4, 2017

Backups, Synology, and the Recycle Bin

I am a big fan of Synology products, owning their 5-bay DS1515+ NAS series (now 1517+) and the RT2600 router. The Synology software is fantastic, user-friendly, and allows additional packages to be installed, further extending the features of the product. Within the last year or so, they've greatly improved their cloud-syncing features, and the CloudSync package provides an easy way to sync files to/from Synology to cloud providers. They also have packages for syncing to other storage options (such as Amazon's Glacier storage).

I had recently configured my NAS to back up all local files to Amazon Drive, which offers unlimited storage for $59.99 a year (plus tax). Yes, unlimited! It's a great deal. If you don't have a cloud backup location for your files, Amazon's offering is worth looking into. They also have software you can install across different platforms to sync local files and directories to the cloud.

I was cleaning up the folder structure today in the Synology File Station software and accidentally deleted a root folder (yikes!). I immediately caught the error and paused the sync, but it was too late; some of the files were deleted, both on the Synology and on Amazon. Fortunately, Amazon Drive has a "Recycle Bin," so I was able to recover the files. However, this prompted me to enable a feature I had assumed was already turned on: Synology's Recycle Bin. You should verify yours is turned on too.

Navigate to the Synology Control Panel, choose the Shared Folders icon, select the appropriate folder, choose Edit, and check the "Enable Recycle Bin" option. Now if you do something terrible like deleting an important folder, at least you won't have to wait hours to pull it back down from your sync location.

Two lessons learned:
1. Make sure you have more than one backup. Seriously, buy some space on Amazon or another cloud provider, set up a sync, and make sure it completes. It's important!
2. Make sure your folders have an "undelete" option available.

And just for grins, the files on my Synology that are longer-term, unchanging backups are going to have a third backup location on Amazon's Glacier storage, so I'm covered there. At $0.01/month/GB, it's a cheap option.